Abstract

We consider slow fading relay channels with a single multi-antenna source-destination terminal pair. 
The source signal arrives at the destination via hops through $N - 1$ layers of relays. We analyze the
diversity of such channels with fixed network size at high SNR. In the clustered case where the relays 
within the same layer can have full cooperation, the cooperative decode-and-forward (DF) scheme is 
shown to be optimal in terms of the diversity-multiplexing tradeoff (DMT). The upper bound on the 
DMT, the cut-set bound, is attained. In the non-clustered case, we show that the naive amplify-and-forward
(AF) scheme has the maximum multiplexing gain of the channel but is suboptimal in diversity,
as compared to the cut-set bound. To improve the diversity, space-time relay processing is introduced 
through the parallel partition of the multihop channel. The idea is to let the source signal go through 
K different "AF paths" in the multihop channel. This parallel AF scheme creates a parallel channel in 
the time domain and has the maximum diversity if the partition is properly designed. Since this scheme 
does not achieve the maximum multiplexing gain in general, we propose a flip-and-forward (FF) scheme 
that is built from the parallel AF scheme. It is shown that the FF scheme achieves both the maximum 
diversity and multiplexing gains in a distributed multihop channel of arbitrary size. In order to realize the 
DMT promised by the relaying strategies, approximately universal coding schemes are also proposed. 

Index Terms 

Relay channel, multiple-input multiple-output (MIMO), multihop, diversity-multiplexing tradeoff (DMT), 
amplify-and-forward (AF). 
Diversity of MIMO Multihop Relay Channels

I. Introduction 

Recent years have seen a surge of interest in wireless networks. Unlike traditional point-to-point
communication, elementary modes of cooperation such as relaying are needed to improve
both the throughput and reliability in a wireless network. Although the capacity of a relay channel [1],
[2] is still unknown in general, considerable progress has been made on several aspects, including 
some achievable capacity results [3], [4] and capacity scaling laws of large networks [5]-[9]. In 
parallel, research on the cooperative diversity [10], [11], where the relays help the source exploit 
the spatial diversity of a slow fading channel in a distributed fashion, has attracted significant 
attention [12]-[18]. 

In small relay networks where the source signal can reach the destination terminal via a
direct link, many results are known on both the channel capacity [2], [3] and the cooperative
diversity. The capacity results are mostly based on the decode-and-forward (DF) and
the compress-and-forward (CF) strategies. The amplify-and-forward (AF) scheme, however, is 
rarely considered in this scenario due to the noise accumulation at the relays. On the other hand, 
the AF scheme is widely used for cooperative diversity. It has been shown in [13], [15] that 
the AF scheme is as good as the DF scheme at high SNR as far as the diversity is concerned. 
Furthermore, it is pointed out in [17] that not needing to decode the source signal makes the relays 
more capable of protecting the source signal in some cases. The CF scheme, which works with 
perfect global channel state information (CSI), is usually excluded in the cooperative diversity 
scenario for practical considerations. In larger relay networks, where direct source-destination 
links are generally absent, substantial results on the capacity scaling laws have been obtained in 
the large network size regime [5]-[7], [9] . However, much less is known about the cooperative 
diversity than in the case of small networks. 

This paper analyzes the cooperative diversity in relay networks with a single multi-antenna 
source-destination terminal pair. The source signal arrives at the destination via a sequence of $N$
hops through $N - 1$ layers of relays. A similar channel setting with a single layer has been studied
in [19]-[21] in different contexts. Using large random matrix theory, the ergodic capacity results 
of some particular relaying schemes have been established for large networks [19]. Recently, 
the study has been extended to the case with multiple layers of relays [22] and the case with
multiple source-destination pairs [8]. Cooperative diversity in this setting was first studied in [20] 
for the single-antenna case then in [21] for the multi-antenna case, with distributed space-time 
coding. All the mentioned works assume linear processing at the relays and the DF scheme is 
not considered. Actually, one can figure out immediately that the DF scheme is not suitable for 
the multi-antenna setting due to the suboptimality in terms of degrees of freedom. Requiring 
the relays to decode the source signal restricts the achievable degrees of freedom. This is one 
of the fundamental differences between large networks and small networks: the degrees
of freedom of the latter are determined by the source-destination link and not by the relaying 
strategy. 

In this work, we suppose that the network size is arbitrary (but fixed) and the signal-to-noise 
ratio (SNR) is large. The multihop channel is investigated in terms of the diversity-multiplexing 
tradeoff (DMT). The DMT was introduced in [23] for the point-to-point multi-antenna (MIMO) 
channels to capture the fundamental tradeoff between the throughput and reliability in a slow 
fading channel at high SNR. It was then extensively used in multiuser channels such as the 
multiple access channels [24] and the relay channels [12], [13], [16]-[18] as performance measure 
and design criterion of different schemes. Our main contributions are summarized in the following 
paragraphs. 

First, we use the information theoretic cut-set bound [25] to derive an upper bound on the 
DMT of any relaying strategy. In the clustered case where the relays in the same layer can fully 
cooperate, this bound is shown to be tight. An optimal scheme is the cooperative DF scheme, 
where the clustered relays perform joint decoding and joint re-encoding. 

While the clustered channel is equivalent to a series-channel and does not feature the distributed 
nature of wireless networks, the non-clustered case is studied as the main focus of the paper. 
Since no within-layer cooperation is considered, linear processing at the relays is assumed. We 
start with the AF strategy, which seems to be the natural first choice as a linear relaying scheme.
We show that the AF scheme is, in the DMT sense, equivalent to the Rayleigh product (RP) 
channel, a point-to-point channel whose channel matrix is defined by a product of Gaussian 
matrices. That being said, we examine the RP channel in great detail. It turns out that the DMT 
of a RP channel has a nice recursive structure and lends some intuitive insights into the typical 
outage events in such channels. The study of the RP channel leads directly to an exact DMT 
characterization for the AF scheme in multihop channels of arbitrary size. The closed-form DMT
provides simple guidelines on how to efficiently use the available relays with the AF scheme. 
One such example is how to reduce the number of relays while keeping the same diversity. While 
the maximum multiplexing gain is achieved, the achievable diversity gain of the AF scheme can 
be far from the maximum diversity gain suggested by the cut-set bound. Specifically, the DMT of
the AF scheme is limited by a virtual "bottleneck" channel. 

The following question is then raised: is the DMT cut-set bound tight in the non-clustered
case? The question is partially answered in this work: there exists a scheme that achieves both
extremes of the cut-set bound, that is, the maximum diversity extreme and maximum multiplexing 
extreme. In order to achieve the maximum diversity gain, the key is space-time relay processing. 
Noting that the AF scheme is space-only, we incorporate the temporal processing into the AF 
scheme. The first scheme that we propose is the parallel AF scheme. By partitioning the multihop 
channel into K "AF paths", we create a set of K parallel sub-channels in the time domain. A 
packet that goes through the parallel channel attains an improved diversity if the partition is 
properly designed. It is shown that there is at least one partition such that the maximum diversity 
is achieved. However, the parallel AF scheme does not have the maximum multiplexing gain in 
general, since the achievable degrees of freedom by the scheme are restricted by those of the 
individual AF paths. In most cases, the AF paths are not as "wide" as the original channel in 
terms of the degrees of freedom. In order to overcome the loss of degrees of freedom, we linearly 
transform the set of parallel AF channels into another set in which each sub-channel has the 
same degrees of freedom as the multihop channel. In the new parallel channel, each relay only
needs to flip the received signal in a pre-assigned mode, hence the name flip-and-forward (FF).
It is shown that the FF scheme achieves both the maximum diversity and multiplexing gains. 
Furthermore, the DMT of the FF scheme is lower-bounded by that of the AF scheme. 

Using the results obtained in the non-clustered case, we revisit the clustered case by pointing 
out that the cooperative DF operation might not be needed in all clusters to get the maximum 
diversity. We also indicate that cross-antenna linear processing in each cluster helps to improve 
the DMT only when both transmitter CSI and receiver CSI are known to the relays. 

Finally, coding schemes are proposed for all the studied relaying strategies. In the clustered 
case, a series of Perfect space-time block codes (STBCs) [26], [27] with appropriate rates and 
dimensions are used at the source and each relay cluster that performs the cooperative DF 



February 1, 2008 



DRAFT 






operation. In the non-clustered case, construction of Perfect STBCs for general parallel MIMO 
channels is first provided. The constructed codes can be applied directly to the parallel AF 
scheme and the FF scheme. All suggested coding schemes achieve the DMT regardless of the
fading statistics and are thus approximately universal [28]. 

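Regarding the notations, we use boldface lower case letters $\mathbf{v}$ to denote vectors and boldface
capital letters $\mathbf{M}$ to denote matrices. $\mathcal{CN}(\mu, \sigma^2)$ represents a complex Gaussian random variable
with mean $\mu$ and variance $\sigma^2$. $\mathbb{E}[\cdot]$ stands for the expectation operator. $[\cdot]^{T}$ and $[\cdot]^{\dagger}$ respectively denote
the matrix transposition and conjugated transposition operations. $\|\cdot\|$ is the vector norm. $\|\cdot\|_F$ is
the Frobenius matrix norm. We define $\prod_{i=1}^{N} \mathbf{M}_i \triangleq \mathbf{M}_N \cdots \mathbf{M}_1$ for any matrices $\mathbf{M}_i$'s. The square
root $\sqrt{\mathbf{P}}$ of a positive semi-definite matrix $\mathbf{P}$ is defined as a positive semi-definite matrix such
that $\mathbf{P} = \sqrt{\mathbf{P}}\,(\sqrt{\mathbf{P}})^{\dagger}$. $\lambda_{\max}(\mathbf{P})$ and $\lambda_{\min}(\mathbf{P})$ denote respectively the maximum and minimum
eigenvalues of a semi-definite matrix $\mathbf{P}$. $(x)^{+}$ means $\max(0, x)$. $\lceil a \rceil$ (respectively, $\lfloor a \rfloor$) is the
closest integer that is not smaller (respectively, not larger) than $a$. $(a)_b$ means $a \bmod b$. $\log(\cdot)$
stands for the base-2 logarithm. For any quantity $q$,

$$q \doteq \mathsf{SNR}^{a} \quad \text{means} \quad \lim_{\mathsf{SNR}\to\infty} \frac{\log q}{\log \mathsf{SNR}} = a$$

and similarly for $\dot{\le}$ and $\dot{\ge}$. The tilde notation $\tilde{\mathbf{n}}$ is used to denote the (increasingly) ordered
version of $\mathbf{n}$. Let $\mathbf{m}$ and $\mathbf{n}$ be two vectors of respective length $L_m$ and $L_n$, then $\mathbf{m} \preceq \mathbf{n}$ means
$\tilde{m}_i \le \tilde{n}_i,\ \forall\, i = 1, \ldots, \min\{L_m, L_n\} - 1$. $\mathbf{m} \subseteq \mathbf{n}$ means that $\mathbf{m}$ is a sub-vector of some permuted
version of $\mathbf{n}$.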

The rest of this paper is organized as follows. Section II describes the system model and
some basic assumptions in our work. The DMT cut-set bound and the clustered case with the
DF scheme are also presented there. In Section III, we study the non-clustered case with the AF scheme.
The parallel AF and the FF schemes are proposed in Section IV. In Section V, the clustered case
is revisited. The approximately universal coding schemes are proposed in Section VI. Section VII
provides some selected numerical examples. Finally, a brief conclusion is drawn in Section VIII.
Most detailed proofs are deferred to the appendices.
II. System Model and Basic Assumptions

A. Channel Model 

The considered $N$-hop relay channel model is composed of one source (layer 0), one destination
(layer $N$), and $N - 1$ layers of relays (layer 1 to layer $N - 1$). Each terminal is equipped with
multiple antennas. The total number of antennas in layer $i$ is denoted by $n_i$. For convenience, we
define $n_t \triangleq n_0$, $n_r \triangleq n_N$, and $n_{\min} \triangleq \min_{i=0,\ldots,N} n_i$. We assume that the source signal arrives at
the destination via a sequence of $N$ hops through the $N - 1$ layers and that terminals in layer $i$
can only receive the signal from layer $i - 1$. The fading sub-channel between layer $i - 1$ and
layer $i$ is denoted by the matrix $\mathbf{H}_i$. Sub-channels are assumed to be mutually independent, flat
Rayleigh-fading and quasi-static. That is, the channel coefficients are independent and identically
distributed (i.i.d.) complex circular symmetric Gaussian with unit variance. They remain
constant during a coherence interval of length $L$ and change independently from one coherence
interval to another. Furthermore, the transmission is supposed to be perfectly synchronized. Under
these assumptions, the signal model within a coherence interval can be written as

$$\mathbf{y}_i[l] = \mathbf{H}_i \mathbf{x}_{i-1}[l] + \mathbf{z}_i[l], \quad l = 1, \ldots, L,$$

where $\mathbf{x}_i[l], \mathbf{y}_i[l] \in \mathbb{C}^{n_i \times 1}$ denote the transmitted and received signal at layer $i$; $\mathbf{z}_i[l] \in \mathbb{C}^{n_i \times 1}$ is the
additive white Gaussian noise (AWGN) at layer $i$ with i.i.d. $\mathcal{CN}(0, 1)$ entries. Since we consider
the non-ergodic case where the coherence time interval $L$ is large enough, we drop the time
index $l$ hereafter. It is assumed that all relays work in full-duplex¹ mode and the transmission
is subject to the short-term power constraint

$$\mathbb{E}\{\|\mathbf{x}_i\|^2\} \le \mathsf{SNR}, \quad \forall\, i \qquad (1)$$

with $\mathsf{SNR}$ being the average transmitted SNR per layer. All terminals are supposed to have
perfect channel state information at the receiver² and no CSI at the transmitter. From now on,
we denote the channel as a $(n_0, n_1, \ldots, n_N)$ multihop channel.

¹This assumption is merely for simplicity of notation. Since we assume that no cross-talk exists between different channels,
the half-duplex constraint is directly translated to a reduction of degrees of freedom by a factor of two and does not impact the
relaying strategy. This is achieved by letting all even-numbered (respectively, odd-numbered) nodes transmit (respectively, receive)
in even-numbered time slots and receive (respectively, transmit) in odd-numbered time slots.

²As we will see, assuming no CSI at all at the relays will not change the results of our work.
B. Diversity-Multiplexing Tradeoff

Slow fading channels are outage-limited, i.e., there is an outage probability Pout(SNR, R) that 
the channel cannot support a target data rate of R bits per channel use at signal-to-noise ratio 
SNR. In the high SNR regime, this fundamental interplay between throughput and reliability is 
characterized by the diversity-multiplexing tradeoff [23]. 

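Definition 2.1: The multiplexing gain $r$ and diversity gain $d$ of a fading channel are defined by

$$r \triangleq \lim_{\mathsf{SNR}\to\infty} \frac{R(\mathsf{SNR})}{\log \mathsf{SNR}} \quad \text{and} \quad d \triangleq -\lim_{\mathsf{SNR}\to\infty} \frac{\log P_{\mathrm{out}}(\mathsf{SNR}, R)}{\log \mathsf{SNR}}.$$

A more compact form is

$$P_{\mathrm{out}}(\mathsf{SNR}, r \log \mathsf{SNR}) \doteq \mathsf{SNR}^{-d(r)}. \qquad (2)$$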

Note that in the definition we use the outage probability instead of the error probability, since 
it is shown in [23] that the error probability of any particular coding scheme with maximum 
likelihood (ML) decoding is dominated by the outage probability at high SNR and that the thus 
defined DMT is the best that one can achieve with any coding scheme. In the Rayleigh MIMO 
channel, the DMT has the following closed form. 

Lemma 2.1 ([23]): The DMT of a $n_t \times n_r$ Rayleigh channel is a piecewise-linear function
connecting the points $(k, d(k))$, $k = 0, 1, \ldots, \min(n_t, n_r)$, where

$$d(k) = (n_t - k)(n_r - k). \qquad (3)$$
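
As a quick numerical illustration of Lemma 2.1, the following Python sketch evaluates the piecewise-linear tradeoff curve at an arbitrary multiplexing gain by interpolating between the corner points $(k, (n_t-k)(n_r-k))$. The function name `rayleigh_dmt` and its interface are illustrative assumptions introduced here; they do not come from [23].

```python
import math


def rayleigh_dmt(n_t: int, n_r: int, r: float) -> float:
    """Diversity gain d(r) of an n_t x n_r Rayleigh channel.

    Linear interpolation between the corner points (k, (n_t-k)(n_r-k));
    the curve is taken to be zero for r >= min(n_t, n_r).
    """
    if r < 0:
        raise ValueError("multiplexing gain must be non-negative")
    if r >= min(n_t, n_r):
        return 0.0
    k = math.floor(r)
    d_lo = (n_t - k) * (n_r - k)          # value at the integer point k
    d_hi = (n_t - k - 1) * (n_r - k - 1)  # value at the integer point k + 1
    return d_lo + (r - k) * (d_hi - d_lo)
```

For example, `rayleigh_dmt(2, 2, 0.5)` lies halfway between the corner values 4 and 1.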

In the following, we will use the DMT as our performance measure. For convenience of 
presentation, we provide the following definition. 

Definition 2.2: Two channels are said to be DMT-equivalent or equivalent if they have the 
same DMT. 

C. Upper Bound on the DMT 

Before studying any specific relaying strategy, we establish an upper bound on the DMT of 
the multihop system as a benchmark. 

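Proposition 2.1 (Cut-set bound): For any relaying strategy $\mathcal{T}$, we have

$$d^{\mathcal{T}}(r) \le \bar{d}(r)$$

with

$$\bar{d}(r) \triangleq \min_{i=1,\ldots,N} d_i(r), \qquad (4)$$

where $d_i(r)$ is the DMT of the point-to-point channel between layer $i - 1$ and layer $i$. In
particular, by defining the maximum diversity gain and multiplexing gain as $d_{\max} \triangleq \bar{d}(0)$ and
$r_{\max} \triangleq \sup\{r : \bar{d}(r) > 0\}$, respectively, we have

$$d_{\max} = \min_{i=1,\ldots,N} n_{i-1} n_i, \quad \text{and} \qquad (5)$$

$$r_{\max} = \min_{i=0,\ldots,N} n_i. \qquad (6)$$

Proof: From the information theoretic cut-set bound [25], the mutual information between
the source and the destination satisfies

$$I_{\mathcal{T}}(\mathbf{x}_0; \mathbf{y}_N \mid \mathbf{H}_1, \ldots, \mathbf{H}_N) \le I(\mathbf{x}_{i-1}; \mathbf{y}_i \mid \mathbf{H}_i), \quad \forall\, i,$$

for any relaying strategy $\mathcal{T}$. Thus, the outage probability using a relaying scheme $\mathcal{T}$ is

$$P_{\mathrm{out}}^{\mathcal{T}}(R) = \mathrm{Prob}_{\mathbf{H}_1,\ldots,\mathbf{H}_N}\{ I_{\mathcal{T}}(\mathbf{x}_0; \mathbf{y}_N \mid \mathbf{H}_1, \ldots, \mathbf{H}_N) < R \}$$
$$\ge \max_i \mathrm{Prob}_{\mathbf{H}_i}\{ I(\mathbf{x}_{i-1}; \mathbf{y}_i \mid \mathbf{H}_i) < R \}$$
$$= \max_i P_{\mathrm{out},i}(R), \qquad (7)$$

where $P_{\mathrm{out},i}(R)$ is the outage probability of the $i$th sub-channel. From (2) and (7), we prove (4).
Finally, (5) and (6) are from the direct application of Lemma 2.1. □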
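
To make Proposition 2.1 concrete, the sketch below evaluates the cut-set bound as the pointwise minimum of the per-hop Rayleigh tradeoff curves of Lemma 2.1, together with the two extremes (5) and (6). The helper names (`hop_dmt`, `cut_set_dmt`, `max_diversity`, `max_multiplexing`) and the tuple-based description of the network are assumptions made for this illustration only.

```python
import math


def hop_dmt(n_in: int, n_out: int, r: float) -> float:
    """Rayleigh DMT of one hop (piecewise-linear through (k, (n_in-k)(n_out-k)))."""
    if r >= min(n_in, n_out):
        return 0.0
    k = math.floor(r)
    d_lo = (n_in - k) * (n_out - k)
    d_hi = (n_in - k - 1) * (n_out - k - 1)
    return d_lo + (r - k) * (d_hi - d_lo)


def cut_set_dmt(dims, r: float) -> float:
    """Cut-set bound (4): minimum over the N hops of the per-hop DMT."""
    return min(hop_dmt(dims[i - 1], dims[i], r) for i in range(1, len(dims)))


def max_diversity(dims) -> int:
    """Equation (5): smallest product of adjacent layer sizes."""
    return min(dims[i - 1] * dims[i] for i in range(1, len(dims)))


def max_multiplexing(dims) -> int:
    """Equation (6): smallest layer size."""
    return min(dims)
```

For a hypothetical $(2, 4, 3)$ channel, the first hop is the bottleneck at $r = 0$, so `cut_set_dmt((2, 4, 3), 0)` agrees with `max_diversity((2, 4, 3))`.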

D. The Clustered Case and Decode-and-Forward 

If we assume that the relays within the same layer are clustered, i.e., they can perform joint 
decoding and joint re-coding operations, then each layer can act as a virtual multi-antenna 
terminal. This could happen either when the relays are controlled by a central unit via wired 
links or when they are close enough to each other to exchange information perfectly. In this 
case, the relay channel model is equivalent to a serial concatenation of N independent MIMO 
channels. Let us consider the following cooperative decode-and-forward scheme. Each layer 
tries to cooperatively decode the received signal. When a successful decoding is assumed, the 
embedded message is re-encoded and then forwarded to the next layer. We can show that this 
simple scheme is DMT optimal. 
Proposition 2.2: When the relays are clustered, the cooperative DF scheme achieves the DMT
cut-set bound $\bar{d}(r)$ defined in (4).

Proof: To show the achievability, note that the cooperative DF scheme being in outage
implies the outage of at least one of the sub-channels. By the union bound,

$$P_{\mathrm{out}}^{\mathrm{DF}}(R) \le \sum_{i=1}^{N} P_{\mathrm{out},i}(R).$$

At high SNR, the probability is dominated by the largest term in the sum on the right-hand
side (RHS). From (2), we get

$$d^{\mathrm{DF}}(r) \ge \min_{i} d_i(r) = \bar{d}(r).$$

□

In the high SNR regime, the union bound defined by the sum operation coincides in the SNR 
exponent with the cut-set bound defined by the minimum operation. Hence, the DMT cut-set 
bound is tight in the clustered case. However, relays in wireless networks are not clustered 
in general. In fact, one of the important and interesting attributes of wireless networks is the 
distributed nature. In the following two sections, we will concentrate on the non-clustered case 
and analyze the achievable DMT. 

III. Amplify-and-Forward 

In this section, we consider the non-clustered case, where the relays work in a distributed 
manner and no within-layer communication is allowed. In this case, applying the DF scheme at 
each individual relay might incur loss of degrees of freedom. To see this, take the single-layer 
channel as an example. In the best case where all the relays succeed in decoding, they transmit the 
message using a pre-assigned codebook. This scheme transforms the relays-destination channel 
into a $n_1 \times n_2$ virtual MIMO channel. Before this could possibly happen, however, successful
decoding at the relays must be guaranteed with high probability. This constraint imposes that
the degrees of freedom in this scheme must not be larger than $\min_k \{n_{1,k}\}$, with $n_{1,k}$ being the
number of antennas at the $k$th relay. While this scheme achieves the maximum multiplexing
gain in the single-antenna case, it could fail in the multi-antenna case.

Since we do not know how to cooperate efficiently in this case, we start with the most obvious
and most naive relaying scheme: the amplify-and-forward scheme. This scheme in the considered
setting has been studied in [19], [22] for the capacity scaling laws, and in [29] for the DMT.
It is worth noting³ that, in [29], a lower bound on the DMT of the AF scheme in a symmetric
network ($n_i = n,\ \forall\, i$) was obtained, while our work derives the exact DMT for a network of
arbitrary dimension with a different approach.

A. Signal Model 

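In the considered AF scheme, each antenna node normalizes the received signal to the same
power level and then retransmits it. This linear operation can be expressed as

$$\mathbf{x}_i = \mathbf{D}_i \mathbf{y}_i, \quad i = 1, \ldots, N - 1,$$

where, by the power constraint (1), the scaling matrix $\mathbf{D}_i \in \mathbb{C}^{n_i \times n_i}$ is diagonal due to the
antenna-wise nature of the relaying scheme, with the normalization factors⁴

$$\mathbf{D}_i(j,j) = \sqrt{\frac{1}{n_i}\, \frac{\mathsf{SNR}}{\frac{\mathsf{SNR}}{n_{i-1}} \left( \sum_{k} |\mathbf{H}_i(j,k)|^2 \right) + 1}}. \qquad (8)$$

Thus, the signal model of the end-to-end channel is

$$\mathbf{y}_N = \left( \prod_{i=1}^{N} \mathbf{D}_i \mathbf{H}_i \right) \mathbf{x}_0 + \sum_{j=1}^{N} \left( \prod_{i=j}^{N} \mathbf{H}_{i+1} \mathbf{D}_i \right) \mathbf{z}_j, \qquad (9)$$

where, for the sake of simplicity, we defined $\mathbf{H}_{N+1} = \mathbf{I}$ and $\mathbf{D}_N = \mathbf{I}$. The whitened form of
this channel is

$$\mathbf{y} = \sqrt{\mathbf{R}}^{-1} \left( \prod_{i=1}^{N} \mathbf{D}_i \mathbf{H}_i \right) \mathbf{x}_0 + \mathbf{z},$$

where $\mathbf{z}$ is the whitened noise and $\sqrt{\mathbf{R}}^{-1}$ is the whitening matrix with $\mathbf{R}$ being the covariance
matrix of the noise in (9). Since it can be shown that $\lambda_{\max}(\mathbf{R}) \doteq \lambda_{\min}(\mathbf{R}) \doteq \mathsf{SNR}^{0}$, $\mathbf{R}$ can be
neglected in the DMT analysis and the AF channel⁵ is equivalent to the MIMO channel defined
by the following matrix

$$\prod_{i=1}^{N} \mathbf{D}_i \mathbf{H}_i. \qquad (10)$$

The rest of the section is devoted to the DMT analysis of this channel.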



³The authors found [29] at the very end of the preparation for this manuscript.

⁴In the case where a long-term power constraint is imposed, we simply replace the channel coefficients $|\mathbf{H}_i(j,k)|$ in (8) by 1's.

⁵Here, with a slight abuse of terminology, we call the multihop channel with the AF scheme an AF channel.
B. The Rayleigh Product Channel

Definition 3.1: Let $\mathbf{H}_i \in \mathbb{C}^{n_{i-1} \times n_i}$, $i = 1, 2, \ldots, N$, be independent complex Gaussian
matrices with i.i.d. $\mathcal{CN}(0,1)$ entries. A $(n_0, n_1, \ldots, n_N)$ Rayleigh product (RP) channel is a
$n_N \times n_0$ MIMO channel defined by

$$\mathbf{y} = \sqrt{\frac{\mathsf{SNR}}{n_1 \cdots n_N}}\, \boldsymbol{\Pi}\, \mathbf{x} + \mathbf{z}, \qquad (11)$$

where $\boldsymbol{\Pi} \triangleq \mathbf{H}_1 \mathbf{H}_2 \cdots \mathbf{H}_N$ is the channel matrix; $\mathbf{x} \in \mathbb{C}^{n_N \times 1}$ is the transmitted signal with
normalized power, i.e., $\mathbb{E}\{\|\mathbf{x}\|^2\} = n_N$; $\mathbf{y} \in \mathbb{C}^{n_0 \times 1}$ is the received signal; $\mathbf{z} \in \mathbb{C}^{n_0 \times 1} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$
is the AWGN; $\mathsf{SNR}$ is the SNR per receive antenna. $(n_0, n_1, \ldots, n_N)$ is called the
dimension of the channel and $N$ is called the length of the channel.

While this channel model has been studied in terms of the asymptotic eigenvalue distribution in
the large dimension regime [30], we are particularly interested in the fixed dimension case in
the high SNR regime. In this regime, we can define a more general RP channel as

$$\boldsymbol{\Pi}_g \triangleq \mathbf{H}_1 \mathbf{T}_{1,2} \mathbf{H}_2 \cdots \mathbf{H}_{N-1} \mathbf{T}_{N-1,N} \mathbf{H}_N. \qquad (12)$$



Proposition 3.1: The general RP channel is equivalent to

• a $(n_0, n_1, \ldots, n_N)$ RP channel, if all the matrices $\mathbf{T}_{i,i+1}$'s are square and their singular
values satisfy $\sigma_j(\mathbf{T}_{i,i+1}) \doteq \mathsf{SNR}^{0},\ \forall\, i, j$;

• a $(n_0, n'_1, \ldots, n'_{N-1}, n_N)$ RP channel, with $n'_i$ being the rank of the matrix $\mathbf{T}_{i,i+1}$, if the
matrices $\mathbf{T}_{i,i+1}$'s are constant.

Proof: See Appendix III-C. □
Hence, we can consider the RP channel from Definition 3.1 without loss of generality.

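1) Direct Characterization: Recall that $\tilde{\mathbf{n}}$ is the ordered version of $\mathbf{n}$ with $\tilde{n}_N \ge \tilde{n}_{N-1} \ge
\cdots \ge \tilde{n}_0$ and $n_{\min} = \tilde{n}_0$.

Theorem 3.1: The DMT of a RP channel $(n_0, n_1, \ldots, n_N)$ is a piecewise-linear function
connecting the points $(k, d^{\mathrm{RP}}(k))$, $k = 0, 1, \ldots, n_{\min}$, where

$$d^{\mathrm{RP}}(k) = \sum_{i=k+1}^{n_{\min}} c_i \qquad (13)$$

with

$$c_i \triangleq 1 - i + \min_{k=1,\ldots,N} \left\lfloor \frac{\sum_{l=0}^{k} \tilde{n}_l - i}{k} \right\rfloor, \quad i = 1, \ldots, n_{\min}. \qquad (14)$$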
Proof: The DMT depends on the "near zero" probability of the singular values of the channel
matrix. While this probability for the given product matrix is intractable, we can characterize
it by induction on the length $N$. The main idea is that, conditioned on a given product
matrix $\mathbf{H}_1 \mathbf{H}_2 \cdots \mathbf{H}_{N-1}$, $\mathbf{H}_1 \mathbf{H}_2 \cdots \mathbf{H}_N$ is Gaussian, whose singular value distribution is tractable. See
Appendix II for details. □
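
The closed form of Theorem 3.1 is straightforward to evaluate numerically. The sketch below assumes that the coefficients read $c_i = 1 - i + \min_{k=1,\ldots,N} \lfloor (\sum_{l=0}^{k} \tilde{n}_l - i)/k \rfloor$ and that $d^{\mathrm{RP}}(k) = \sum_{i=k+1}^{n_{\min}} c_i$; the function names `rp_dmt_points` and `rp_dmt` are illustrative and are not taken from the paper.

```python
import math


def rp_dmt_points(dims):
    """Corner values [d(0), d(1), ..., d(n_min)] of the RP-channel DMT.

    dims = (n_0, ..., n_N) with N >= 1; only the sorted dimensions matter.
    """
    n = sorted(dims)
    N = len(n) - 1
    n_min = n[0]
    # prefix[k] = n~_0 + ... + n~_k
    prefix = [n[0]]
    for v in n[1:]:
        prefix.append(prefix[-1] + v)
    c = [0] * (n_min + 1)  # c[i] is used for i = 1..n_min
    for i in range(1, n_min + 1):
        c[i] = 1 - i + min((prefix[k] - i) // k for k in range(1, N + 1))
    # d(k) = sum of c_i over i = k+1..n_min
    return [sum(c[k + 1:]) for k in range(n_min + 1)]


def rp_dmt(dims, r: float) -> float:
    """Piecewise-linear interpolation of the corner values at gain r >= 0."""
    pts = rp_dmt_points(dims)
    if r >= len(pts) - 1:
        return 0.0
    k = math.floor(r)
    return pts[k] + (r - k) * (pts[k + 1] - pts[k])
```

With $N = 1$ the routine reduces to the Rayleigh case of Lemma 2.1, e.g. `rp_dmt_points((2, 2))` returns the corner values of a $2 \times 2$ Rayleigh channel, and permuting the entries of `dims` does not change the result.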
The following corollaries are given without proofs. 

Corollary 3.1 (Permutation invariance): The DMT of a RP channel depends only on the
ordered dimension $\tilde{\mathbf{n}}$.

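Corollary 3.2 (Monotonicity): The DMT is monotonic in the following senses:

• if $\mathbf{n}_1 \succeq \mathbf{n}_2$, then

$$d^{\mathrm{RP}}_{\mathbf{n}_1}(r) \ge d^{\mathrm{RP}}_{\mathbf{n}_2}(r), \quad \forall\, r;$$

• if $\mathbf{n}_1 \supseteq \mathbf{n}_2$, then

$$d^{\mathrm{RP}}_{\mathbf{n}_1}(r) \le d^{\mathrm{RP}}_{\mathbf{n}_2}(r), \quad \forall\, r.$$

Corollary 3.3 (Symmetric Rayleigh product channels): When $n_0 = \cdots = n_N = n$, we have

$$d^{\mathrm{RP}}(k) = \frac{(n-k)(n-k+1)}{2} + \frac{a(k)}{2}\big( (a(k) - 1) N + 2 b(k) \big), \qquad (15)$$

where $a(k) \triangleq \left\lfloor \frac{n-k}{N} \right\rfloor$ and $b(k) \triangleq (n - k)_N$.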
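
The symmetric closed form can be checked on small cases. The sketch below assumes that (15) reads $d^{\mathrm{RP}}(k) = \frac{(n-k)(n-k+1)}{2} + \frac{a(k)}{2}\big((a(k)-1)N + 2b(k)\big)$ with $a(k) = \lfloor (n-k)/N \rfloor$ and $b(k) = (n-k) \bmod N$, which is what one obtains by summing the coefficients of Theorem 3.1 in the symmetric case; the function name `symmetric_rp_dmt` is hypothetical.

```python
def symmetric_rp_dmt(n: int, N: int, k: int) -> int:
    """d^RP(k) for the symmetric (n, ..., n) RP channel of length N, 0 <= k <= n."""
    m = n - k
    a, b = divmod(m, N)  # a = floor((n-k)/N), b = (n-k) mod N
    # Both m*(m+1) and a*((a-1)*N + 2*b) are even, so // is exact here.
    return m * (m + 1) // 2 + a * ((a - 1) * N + 2 * b) // 2
```

Two sanity checks follow directly: for $N = 1$ the expression collapses to the Rayleigh value $(n-k)^2$, and for $N \ge n$ the second term vanishes so that only $(n-k)(n-k+1)/2$ remains.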

2) DMT Equivalent Classes: Corollary 3.1 implies that RP channels with the same ordered
dimension belong to the same DMT equivalent class. In the following, a precise characterization
of the DMT class is obtained. Before that, we need the following definitions.

Definition 3.2: A $(m_0, m_1, \ldots, m_k)$ RP channel is said to be a reduction of a $(n_0, n_1, \ldots, n_N)$
RP channel if 1) they are equivalent, 2) $k \le N$, and 3) $\mathbf{m} \preceq \mathbf{n}$. In particular, if $k = N$, then it
is called a vertical reduction. Similarly, if $\tilde{m}_i = \tilde{n}_i,\ \forall\, i \in [0, k]$, it is a horizontal reduction.

Definition 3.3: $(n_0, n_1, \ldots, n_{N^*})$ is said to be a minimal form if no reduction other than itself
exists. Similarly, it is called a minimal vertical form (respectively, minimal horizontal form) if
no vertical (respectively, horizontal) reduction other than itself exists. A RP channel is said to
have order $N^*$ if its minimal form is of length $N^* + 1$.

Theorem 3.2: A $(n_0, n_1, \ldots, n_N)$ RP channel can be reduced to a $(\tilde{n}_0, \tilde{n}_1, \ldots, \tilde{n}_k)$ channel if
and only if

$$k(\tilde{n}_{k+1} + 1) \ge \sum_{l=0}^{k} \tilde{n}_l. \qquad (16)$$

In particular, it can be reduced to a Rayleigh channel if and only if

$$\tilde{n}_2 + 1 \ge \tilde{n}_0 + \tilde{n}_1. \qquad (17)$$

Proof: See Appendix III-A. □
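Corollary 3.4: The channel order $N^*$ is the minimum integer such that (16) is satisfied. The
minimal horizontal form is the minimal form $(\tilde{n}_0, \tilde{n}_1, \ldots, \tilde{n}_{N^*})$ and the minimal vertical form is
$(\tilde{n}_0, \tilde{n}_1, \ldots, \tilde{n}_{N^*}, \hat{n}, \ldots, \hat{n})$ with

$$\hat{n} \triangleq \left\lceil \frac{\sum_{l=0}^{N^*} \tilde{n}_l}{N^*} \right\rceil - 1. \qquad (18)$$

For instance, the minimal form of a $(1, n_1, \ldots, n_N)$ RP channel is $(1, \tilde{n}_1)$, i.e., a $1 \times \tilde{n}_1$ or $\tilde{n}_1 \times 1$
Rayleigh channel.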
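
The reduction test of Theorem 3.2 translates into a short search for the channel order. The sketch below assumes that condition (16) reads $k(\tilde{n}_{k+1} + 1) \ge \sum_{l=0}^{k} \tilde{n}_l$ and, as an additional convention of this illustration, returns $N$ when the condition fails for every $k < N$ (no reduction exists). The names `channel_order` and `minimal_form` are hypothetical.

```python
def channel_order(dims) -> int:
    """Smallest k >= 1 satisfying k*(n~_{k+1} + 1) >= n~_0 + ... + n~_k.

    Falls back to N (the channel length) if no k < N satisfies the condition.
    """
    n = sorted(dims)
    N = len(n) - 1
    for k in range(1, N):
        if k * (n[k + 1] + 1) >= sum(n[:k + 1]):
            return k
    return N


def minimal_form(dims):
    """Ordered dimensions (n~_0, ..., n~_{N*}) of the minimal form."""
    n = sorted(dims)
    return tuple(n[:channel_order(dims) + 1])
```

For instance, a hypothetical $(1, 5, 7, 3)$ channel reduces to `(1, 3)`, in line with the example above, while a symmetric $(2, 2, 2, 2, 2)$ channel has order 2.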

Theorem 3.3: The DMT equivalent class is uniquely identified by the minimal form, i.e., two 
RP channels are equivalent if and only if they have the same minimal form. 

Proof: See Appendix III-B. □

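3) Recursive Characterization: In order to interpret the closed-form DMT of Theorem 3.1,
we derive an equivalent recursive form as shown in the following theorem.

Theorem 3.4: The DMT $d^{\mathrm{RP}}(k)$ defined in (13) can be alternatively characterized by

$$R_1(k):\quad d^{\mathrm{RP}}_{(n_0,\ldots,n_N)}(k) = d^{\mathrm{RP}}_{(n_0-k,\ldots,n_N-k)}(0), \quad \forall\, k; \qquad (19)$$

$$R_2(i):\quad d^{\mathrm{RP}}_{(n_0,\ldots,n_N)}(0) = \min_{j \ge 0} \left\{ d^{\mathrm{RP}}_{(n_0,\ldots,n_i)}(j) + d^{\mathrm{RP}}_{(j,n_{i+1},\ldots,n_N)}(0) \right\}, \quad \forall\, i; \qquad (20)$$

$$R_3(i,k):\quad d^{\mathrm{RP}}_{(n_0,\ldots,n_N)}(k) = \min_{j \ge k} \left\{ d^{\mathrm{RP}}_{(n_0,\ldots,n_i)}(j) + d^{\mathrm{RP}}_{(j,n_{i+1},\ldots,n_N)}(k) \right\}, \quad \forall\, i, k. \qquad (21)$$

Proof: See Appendix IV. □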
A new interpretation of the DMT is as follows. Let us consider a multi-layer network of dimension
$(n_0, n_1, \ldots, n_N)$. Then, $d^{\mathrm{RP}}(k)$ is the minimum "cost" to limit the "network flow" between the
source and the destination to $k$ (the flow-$k$ event). In particular, the maximum diversity $d^{\mathrm{RP}}(0)$ is
the "disconnection cost". Now, we can apply the new interpretation to the results of Theorem 3.4.
First, $R_1(k)$ says that the most efficient way to limit the flow to $k$ is to keep a $(k, k, \ldots, k)$
channel fully connected and to disconnect the $(n_0 - k, n_1 - k, \ldots, n_N - k)$ residual channel, as
shown in Fig. 1(a). Then, $R_2(i)$ suggests that in order to disconnect a $(n_0, n_1, \ldots, n_N)$ channel,
if we allow for $j$ flows from the source to some layer $i$, then the $(j, n_{i+1}, \ldots, n_N)$ channel from
the $j$ "ends" of the flows at layer $i$ to the destination must be disconnected (Fig. 1(b)). Obviously,
the most efficient way is such that the total cost is minimized with respect to $j$. Finally, the
flow-$k$ event takes place when both the flow-$j$ ($j \ge k$) event in the $(n_0, \ldots, n_i)$ channel and the
flow-$k$ event in the $(j, n_{i+1}, \ldots, n_N)$ channel happen at the same time. We can easily verify that
$(R_1(k), R_3(i, k))$ is equivalent to $(R_1(k), R_2(i))$. Also note that $R_2(i)$ and $R_3(i, k)$ hold for any
layer $i$, which guarantees the coherence of the interpretation.
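
The recursive relations can be cross-checked against the direct formula on small networks. The sketch below assumes the direct form $d^{\mathrm{RP}}(k) = \sum_{i=k+1}^{n_{\min}} c_i$ with $c_i = 1 - i + \min_{k} \lfloor (\sum_{l=0}^{k} \tilde{n}_l - i)/k \rfloor$, and reads $R_2(i)$ and $R_3(i,k)$ as minimizations over the number $j$ of flows allowed up to layer $i$, with $j$ ranging from $k$ to $\min(n_0, \ldots, n_i)$. All function names are hypothetical, and the split layer is restricted to $1 \le i \le N - 1$.

```python
def rp_d(dims, k):
    """d^RP(k) at integer k from the direct formula; 0 once k >= n_min."""
    n = sorted(dims)
    N = len(n) - 1
    if k >= n[0]:
        return 0
    prefix = [sum(n[:j + 1]) for j in range(N + 1)]
    return sum(
        1 - i + min((prefix[j] - i) // j for j in range(1, N + 1))
        for i in range(k + 1, n[0] + 1)
    )


def r1(dims, k):
    """Right-hand side of R1(k): disconnection cost of the residual channel."""
    return rp_d(tuple(v - k for v in dims), 0)


def r3(dims, i, k):
    """Right-hand side of R3(i, k): split the network at layer i."""
    head, tail = tuple(dims[:i + 1]), tuple(dims[i + 1:])
    return min(
        rp_d(head, j) + rp_d((j,) + tail, k)
        for j in range(k, min(head) + 1)
    )


def r2(dims, i):
    """Right-hand side of R2(i), which is R3(i, 0)."""
    return r3(dims, i, 0)
```

As a usage example, for a $(3, 3, 3)$ network split at layer 1, the minimum in `r2((3, 3, 3), 1)` is attained at $j = 1$ and $j = 2$, i.e., by two partially bad sub-channels rather than by a single totally bad one.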

The recursive characterization sheds light on the typical outage event of the RP channel. In
the trivial case of $N = 1$ (the Rayleigh channel), the typical and only way for the channel to
be in outage at multiplexing gain $r$ approaching zero is that all the $n_0 \times n_1$ paths are bad,
i.e., all channel gains are close to zero. The disconnection cost is then $n_0 \times n_1$. In the non-trivial
cases ($N > 1$) where channels are concatenated, there are several types of outage events. Each
type is numbered by the index $j$ in (20) and (21). The cost of the type-$j$ event is given by
$d^{\mathrm{RP}}_{(n_0,\ldots,n_i)}(j) + d^{\mathrm{RP}}_{(j,n_{i+1},\ldots,n_N)}(k)$ for a certain $j$. Hence, the typical outage event is the one with the
minimum cost and it does not necessarily happen when one of the sub-channels is totally
bad ($j = 0$ or $j = n_i$). The mismatch of two partially bad sub-channels can also cause outage.
This phenomenon will be detailed later on.
Fig. 2. Diversity-multiplexing tradeoff of $2 \times 2$ and $5 \times 5$ symmetric RP channels.



C. DMT of the AF Scheme 

From the equivalent channel matrix (10) and Proposition 3.1, the AF channel is equivalent to
a $(n_N, n_{N-1}, \ldots, n_0)$ RP channel⁶. Therefore, the DMT of the AF channel is

$$d^{\mathrm{AF}}_{(n_0,\ldots,n_N)}(r) = d^{\mathrm{RP}}_{(n_0,\ldots,n_N)}(r).$$

1) Implications: From the results of Section III-B, several interesting implications on the AF
scheme with respect to the DMT are summarized below.

• Interchanging layers does not influence the DMT. 

• The maximum diversity of the AF scheme is lower- and upper-bounded as 

no(ni + 1) . ,AF ^ ~ ~ .oox 
< <ax < ^orii (22) 

which is obtained via the monotonicity from Corollary 13. 2[ We have set h2> nQ + fii — 1 
for the upper bound and = fiN-i = . ■ . = ni for the lower bound. The upper bound 
shows that there exists a virtual uq x hi "bottleneck" channel that limits the AF scheme 
and that it is not necessarily one of the sub-channels. On the other hand, the lower bound is 
always strictly larger than half the upper bound and is independent of the number of hops 

^Theoretically, this is true only when the singular values ajiDi) = SNR", Vi,j. To this end, it is enough to modify the 
matrices as Di{j,j) = mm{Di{j,j), n} where < k < cx) is a constant independent of SNR. Note that the k is only for 
theoretical proof and is not used in practice, since we can always set k a very large constant but independent of SNR. 
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A*". In the symmetric case (Corollary 13.31) . we observe that the DMT degrades with only 
when N <n and that we have 

AF N _ {n-k){n + l-k) 

^{n,...,n){'^) — 2 

for N > n. The observation can also be deduced from theorem 13.21 applying which we 
infer that the order of any symmetric RP channel with > n is A^* = n. This non-trivial 
lower bound is somewhat anti-intuition, since it means that at this point introducing extra 
fading hops does not degrade the diversity any more. An example illustrating the DMT of 
the 2x2 and 5x5 RP channels of different lengths is in Fig.[2l 
• If one could increase the number of antennas at each relay layer without any constraint, then
intuition tells us that the AF channel could be reduced to a $n_t \times n_r$ point-to-point Rayleigh
MIMO channel and the diversity order is $n_t n_r$. The relay layers "disappear". The intuition
has been confirmed in [19] in the single-layer case with the capacity results. Here, the result
in Theorem 3.2 indicates that, from the diversity point of view, this happens when there are
exactly $n_t + n_r - 1$ antennas at each relay layer. Further increasing the number of antennas is
not necessary in the DMT sense. On the other hand, if the number of available antennas is
fixed, then Corollary 3.4 provides, through the minimal vertical form, the minimum numbers
of antennas at each layer to achieve the diversity that could be achieved if all antennas
were used. In both cases, our results yield simple guidelines to minimize the number of
relay antennas (also the number of relays in general) without loss of optimality of the DMT.
In the same way, the number of transmit antennas at the source terminal can also be reduced
to lower the coding complexity. A numerical example is given in Section VII.
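The bounds in (22) and the symmetric-case expression are easy to tabulate. The following is a minimal sketch, assuming that $\tilde{n}_0 \le \tilde{n}_1$ denote the two smallest antenna numbers among $(n_0, \ldots, n_N)$; the function names are illustrative and do not come from the paper.

```python
from fractions import Fraction


def af_max_diversity_bounds(n):
    """Return the (lower, upper) bounds of (22) on the maximum AF diversity.

    Assumption: n~0 <= n~1 are the two smallest entries of the
    antenna vector n = (n_0, ..., n_N).
    """
    n0, n1 = sorted(n)[:2]
    lower = Fraction(n0 * (n1 + 1), 2)  # n~0 (n~1 + 1) / 2
    upper = n0 * n1                     # n~0 n~1
    return lower, upper


def symmetric_long_chain_dmt(n, k):
    """(n - k)(n + 1 - k) / 2 at integer multiplexing gain k.

    This is the symmetric-case expression quoted for chains whose
    number of hops N is at least n.
    """
    if not 0 <= k <= n:
        raise ValueError("k must be an integer between 0 and n")
    # The product of two consecutive integers is even, so // is exact.
    return (n - k) * (n + 1 - k) // 2


if __name__ == "__main__":
    print(af_max_diversity_bounds((2, 2, 2)))
    print([symmetric_long_chain_dmt(5, k) for k in range(6)])
```

For the (2, 2, 2) channel the two bounds are 3 and 4, which brackets the AF diversity of 3 discussed in Example 3.1.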
(a) Canonical basis (b) Orthogonal basis $\{h_1/\|h_1\|,\ h_1^{\perp}/\|h_1^{\perp}\|\}$

Fig. 3. The (2, 2, 2) multihop channel in two different bases.

2) Comparison to the Cut-Set Bound: A simple comparison between the DMT of the AF
scheme and the cut-set bound (11) is carried out as follows. First, the AF scheme is multiplexing
optimal and achieves the maximum multiplexing gain $\tilde{n}_0$ of the channel. Then, since

$(\tilde{n}_0 - k)(\tilde{n}_1 - k) \le \min_{j=1,\ldots,N} \{(n_{j-1} - k)(n_j - k)\}$, $\forall k$,

the diversity upper bound is generally not achievable by the AF scheme for integer multiplexing
gain $k$. In particular, the best diversity gain of the AF scheme is $\tilde{n}_0 \tilde{n}_1$, while the upper bound
is $\min_i \{n_i n_{i+1}\}$. Finally, for any non-integer multiplexing gain, say $r \in (k, k+1)$, $\bar{d}(r)$ is the
minimum of linear functions and thus concave, while $d^{AF}(r)$ is linear. The comparison shows
that the bottleneck of the channel is always one of the hops (inter-layer sub-channels), while
the bottleneck of the AF scheme is the virtual $\tilde{n}_0 \times \tilde{n}_1$ channel that does not correspond to any
physical sub-channel in most cases. The following remark states the necessary and sufficient
condition for the AF scheme to achieve the maximum diversity.

Remark 3.1: The AF scheme achieves the diversity upper bound $d_{\max}$ if and only if it can be
reduced to the bottleneck of the channel, i.e.,

$\min\{n_{i^*}, n_{i^*+1}\} = \tilde{n}_0$, $\max\{n_{i^*}, n_{i^*+1}\} = \tilde{n}_1$, and $\tilde{n}_2 + 1 \ge \tilde{n}_0 + \tilde{n}_1$, (23)

where $i^*$ is such that $n_i n_{i+1}$ is minimized.

This condition is very stringent. It means that the two layers with minimum numbers of antennas
must stand next to each other and that the other layers must have a large number of antennas.
Moreover, note that the AF scheme achieving the maximum diversity does not necessarily mean
that it achieves $\bar{d}(r)$ for all $r$.
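The comparison with the cut-set bound and the conditions (23) can be checked mechanically. The sketch below is illustrative only: it assumes $\tilde{n}_0 \le \tilde{n}_1 \le \tilde{n}_2$ are the three smallest antenna numbers, accepts any bottleneck hop when several hops tie, and treats the third condition as vacuous for a single hop. These conventions and all function names are assumptions, not part of the paper.

```python
def cutset_diversity(n, k=0):
    """Cut-set value min_j (n_{j-1} - k)^+ (n_j - k)^+ at integer k."""
    return min(max(a - k, 0) * max(b - k, 0) for a, b in zip(n, n[1:]))


def af_virtual_bottleneck(n, k=0):
    """(n~0 - k)^+ (n~1 - k)^+, the left-hand side of the comparison."""
    n0, n1 = sorted(n)[:2]
    return max(n0 - k, 0) * max(n1 - k, 0)


def af_meets_cutset_diversity(n):
    """Evaluate the three conditions of (23) for n = (n_0, ..., n_N).

    Assumptions: ties between bottleneck hops are resolved by accepting
    any minimizing hop, and a single-hop channel has no n~2 to check.
    """
    s = sorted(n)
    d_max = cutset_diversity(n)
    # Some minimizing hop must consist of the two smallest layers.
    bottleneck_ok = any(
        a * b == d_max and sorted((a, b)) == s[:2] for a, b in zip(n, n[1:])
    )
    if len(n) == 2:
        return bottleneck_ok
    return bottleneck_ok and s[2] + 1 >= s[0] + s[1]


if __name__ == "__main__":
    for n in [(2, 2, 2), (3, 2, 4), (2, 4, 3)]:
        print(n, cutset_diversity(n), af_virtual_bottleneck(n),
              af_meets_cutset_diversity(n))
```

For the (2, 2, 2) channel the check fails because $\tilde{n}_2 + 1 = 3 < 4$, in line with Example 3.1, whereas the (3, 2, 4) channel satisfies all three conditions.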

3) Mismatch of Adjacent Sub-Channels: In order to achieve the diversity upper bound, in- 
tuitively, one should assure that the end-to-end channel is good if each sub-channel is good. 
However, this property does not hold for the AF scheme that suffers from the mismatch of 
adjacent sub-channels. A concrete example is as follows. 

Example 3.1: In the symmetric two-hop channel with $n = 2$ (Fig. 3), the diversity order of
the AF scheme is 3 while the upper bound is 4.

Note that the AF channel is in outage if the product channel GH is bad, i.e., all the singular 
values of GH are close to zero. This probability can be decomposed as 

P {GH is bad} = P {both GH and H are bad} + P {GH is bad, while H is not bad} , 
where we can verify that the first probability is essentially the probability of the sub-channel $H$
being bad and that the second one is essentially the probability of $GH$ being bad conditioned
on the event that $H$ is not bad. As we know, the first probability decays with SNR as $\mathrm{SNR}^{-4}$. To
find out the SNR exponent of the second probability, we assume without loss of generality that
the vector $h_1$ is strong enough (since $H$ is not bad), as shown in Fig. 3(a). Now, we apply an
orthogonal basis change from the canonical basis to $\{h_1/\|h_1\|,\ h_1^{\perp}/\|h_1^{\perp}\|\}$ and get the equivalent
channel in Fig. 3(b). The basis change being a unitary transformation that is independent of
the remaining parts of the channel, it does not affect the statistics of the rest of the channel. As
shown in Fig. 3(b), the channel is bad if the three independent edges crossed by the "minimum
cut" are bad. The probability for the latter to happen decays as $\mathrm{SNR}^{-3}$, from which we conclude
that the outage probability scales as $\mathrm{SNR}^{-4} + \mathrm{SNR}^{-3} \doteq \mathrm{SNR}^{-3}$. Therefore, the mismatch between
$G$ and $H$ is the dominating outage event and the end-to-end diversity of the (2, 2, 2) channel
with the AF scheme is 3, as compared to 4 given by the cut-set bound.



IV. Parallel Partition 

The naive AF scheme presented above can be seen as a space-only processing. In the point- 
to-point MIMO channel, it has been shown that space-only coding schemes (e.g., the V-BLAST 
scheme [31]) are suboptimal in diversity. Similarly, the AF scheme, as a space-only relaying 
scheme, does not achieve the maximum diversity in the multihop channel due to the mismatch 
between adjacent sub-channels. The key observation is that, just as space-time codes achieve the maximum
diversity in the point-to-point channel, space-time relay processing should be utilized in order
to exploit the maximum distributed diversity in the multihop channel.

The first attempt was made in [20] with a distributed space-time coding scheme. In this scheme, 
each relay performs temporal random unitary transformation on the received signal from the 
source in an independent way. Then, they forward the transformed signal at the same time as 
if they were jointly sending a space-time codeword. The spatial correlation of the codewords is 
due to the fact that the received signal at different relays is from the same source. The temporal 
correlation, on the other hand, is brought in by the temporal transformation. In their setting 
where a single layer of relays and single-antenna terminals are assumed, the maximum diversity 
of the channel is achieved. This scheme is then generalized to the multi-antenna case [21] 
with structured algebraic transformations [32] instead of random transformations. However, 
generalization of such schemes to the multihop case is difficult and the DMT is hard, if not
impossible, to calculate. 

In the following, we present a different approach to introduce the temporal processing. This 
approach does not depend on the dimension of the channel and is thus suitable for multihop
channels with an arbitrary number of hops. The idea is to partition the relays in each layer. Based on
the partition, the relays coordinately amplify-and-forward the received signal in a pre-assigned 
mode that changes periodically, which creates a parallel channel in the time domain. Such 
partition is thus called parallel partition. We show that the mismatch is removed in this way 
and the diversity upper bound is achieved. 

In order to describe a parallel partition, some definitions and notations are needed. A supernode
$\mathcal{S}$ is a set of indices corresponding to a subset of antenna nodes in the same layer. The cardinality
of $\mathcal{S}$ is called the size of the supernode. An edge is defined as the channel between two antenna
nodes from adjacent layers. An AF path is defined as a sequence of consecutive supernodes from
the source to the destination, each supernode performing the AF operation. A parallel partition
$\mathcal{P}$ is defined as a set of AF paths. The number of AF paths in a partition is called the partition
size and denoted by $|\mathcal{P}|$. An independent parallel partition is defined as a parallel partition where
any two different AF paths do not share common edges. An independent partition of maximum
size is called a maximum partition. An independent partition that achieves the maximum diversity
$d_{\max}$ is called a full diversity partition.

Lemma 4.1: For any fading channel defined by $H$, we have

$P\{\mathrm{SNR}\,\|H\|_F^2 < 1\} \doteq \mathrm{SNR}^{-d(0)}$, (24)

where $d(r)$ is the DMT of the channel.

Proof: See Appendix V-A. □
Lemma 4.2: Let us consider a set of $K$ independent parallel AF channels

$y_k = \Pi_k x_k + z_k$, $k = 1, \ldots, K$,

where the $\Pi_k$'s are statistically independent. Then, the diversity order of the parallel channel is the
sum of the diversity orders of the individual AF sub-channels. Furthermore, if all the sub-channels
have the same DMT $d_0(r)$, then the DMT of the parallel channel is $K d_0(r)$.

Proof: See Appendix V-B. □



February 1, 2008 



DRAFT 






A. Independent Parallel Partition 

The independent parallel partition is accomplished in two steps: 1) partition each layer into
supernodes, and 2) find $K$ independent AF paths connecting the supernodes. Each AF path
defines a relaying mode: only the supernodes in this path are on and perform the AF operation.
Assume that a data frame of length $KT$ is transmitted. Then, the relays change the relaying
mode every $T$ symbol times. We call it a parallel AF scheme, since the end-to-end channel is
equivalent to a parallel AF channel in the time domain. Note that the AF scheme is the trivial
partition of size 1 with a single "wide" AF path. As shown in Remark 3.1, the trivial partition
achieves the maximum diversity only when the wide AF path satisfies the conditions in (23).
This being impossible in general, the parallel partition aims to find independent "narrow" paths,
each of which satisfies the conditions in (23). If the number of independent paths
is large enough, then the maximum diversity order can be achieved according to Lemma 4.2.
Intuitively, the narrower the AF path is, the easier it is to satisfy the conditions (23). In the
extreme case with the narrowest AF path (1, 1, . . . , 1), all conditions in (23) are met.

Lemma 4.3: In a $(n_0, n_1, \ldots, n_N)$ multihop channel, there are exactly $d_{\max}$ independent single-
antenna AF paths.

Proof: First, the converse is true, since otherwise at least two AF paths share the same
edge in the bottleneck of the channel. Then, the achievability is shown by construction: we
connect the multihop channel in such a way that 1) there are $d_{\max}$ incoming and outgoing edges
for each intermediate layer, and 2) the number of the incoming and outgoing edges is the same for
each antenna node (say, in layer $i$) and can be either $\lfloor d_{\max}/n_i \rfloor$ or $\lceil d_{\max}/n_i \rceil$. This partition
contains $d_{\max}$ independent (1, 1, . . . , 1) AF paths, each of which has diversity 1. □
The lemma implies that the maximum partition is of size $d_{\max}$. From Lemmas 4.2 and 4.3, the
following proposition is immediate.

Proposition 4.1: With the parallel AF scheme, the DMT

$d_{\max}\,(1 - r)^+$ (25)

is always achievable in a multihop channel of arbitrary number of hops and antennas.

Proof: The DMT (25) is simply achieved by applying the parallel AF scheme with the
maximum partition. In this case, $d_{\max}$ single-input-single-output (SISO) parallel sub-channels
are generated, from which we have the DMT (25). □
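The constructive half of Lemma 4.3 can be illustrated with a short routine. The sketch below is one possible way to build $d_{\max}$ edge-disjoint single-antenna paths and is not claimed to be the authors' construction: paths are spread evenly over the source antennas, and at every hop the paths are sorted by their current node and dealt out cyclically over the next layer. Because a node carries at most $\lceil d_{\max}/n_i \rceil \le n_{i+1}$ paths, paths leaving the same node always reach distinct nodes. Antennas are indexed from 0 and the function names are hypothetical.

```python
def max_partition_paths(n):
    """Build d_max edge-disjoint (1, 1, ..., 1) AF paths for n = (n_0, ..., n_N).

    Each path is a list with one antenna index (0-based) per layer.
    """
    d_max = min(a * b for a, b in zip(n, n[1:]))
    # Spread the paths as evenly as possible over the source antennas.
    paths = [[p % n[0]] for p in range(d_max)]
    for layer in range(1, len(n)):
        # Group paths by their current node, then deal them out cyclically,
        # so that paths sharing a node land on distinct nodes of the next layer.
        order = sorted(range(d_max), key=lambda p: paths[p][-1])
        for rank, p in enumerate(order):
            paths[p].append(rank % n[layer])
    return paths


def edges_are_disjoint(paths):
    """True if no two paths use the same edge in any hop."""
    for hop in range(len(paths[0]) - 1):
        edges = [(p[hop], p[hop + 1]) for p in paths]
        if len(set(edges)) != len(edges):
            return False
    return True


if __name__ == "__main__":
    for path in max_partition_paths((2, 4, 3)):
        print(path)
```

For the (2, 4, 3) channel this produces 8 paths, which is the maximum partition size quoted for that channel later in the text.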
While the maximum diversity gain is achieved, this scheme only exploits one out of $\tilde{n}_0$ degrees
of freedom of the channel. This is due to the SISO nature of the AF paths in the maximum
partition. In order to improve the achievable multiplexing gain, we need parallel partitions with
wider AF paths. Meanwhile, we still want the maximum diversity, which requires that the AF
paths should not be too wide. The following theorem states a necessary and sufficient condition
for an independent parallel partition to achieve the maximum diversity.

Theorem 4.1: Let the $n_{i^*} \times n_{i^*+1}$ channel be any bottleneck of the $(n_0, n_1, \ldots, n_N)$ multihop
channel and $\mathcal{P}$ be an independent partition of size $K$. Then, $\mathcal{P}$ is a full diversity partition if and
only if 1) $K = K_{i^*} K_{i^*+1}$ with $K_{i^*} \le n_{i^*}$ and $K_{i^*+1} \le n_{i^*+1}$, and 2) we have

$\min_{i \notin \{i^*, i^*+1\}} n_{k,i} + 1 \ge n_{k,i^*} + n_{k,i^*+1}$, $\forall k$, (26)

where $(n_{k,0}, \ldots, n_{k,N})$ is the vector of numbers of antennas of the $k$th AF path.

Proof: To prove the theorem, let us assume there are respectively $K_{i^*}$ and $K_{i^*+1}$ supernodes
in layer $i^*$ and layer $i^* + 1$, and define $K' = K_{i^*} K_{i^*+1}$. Then, we must have exactly $K\ (\le K')$
connections between the supernodes from these two layers. The diversity of the partition $\mathcal{P}$ is
upper-bounded as

$d_{\mathcal{P}} \le \sum_{k=1}^{K} n_{k,i^*}\, n_{k,i^*+1}$ (27)

$\le \sum_{k=1}^{K'} n_{k,i^*}\, n_{k,i^*+1}$. (28)

Note that $d_{\max} = n_{i^*} n_{i^*+1}$ is achieved if and only if both (27) and (28) hold with equality. Thus, we
must have both (26), according to the conditions in (23), and $K = K_{i^*} K_{i^*+1}$ at the same time.

□

Now, finding full diversity partitions with minimum size is an optimization problem that
minimizes the partition size $|\mathcal{P}|$ subject to the constraint that $\mathcal{P}$ must be an independent partition
and satisfy the conditions given by Theorem 4.1. Unfortunately, it remains an open problem for a
general multihop channel. The main difficulty lies in the lack of knowledge on the mathematical
structure of the independent partitions for a general multihop channel. Nevertheless, the problem
is solved in the two-hop case.
(a) parallel partition (b) flip-and-forward
Fig. 4. Two sets of parallel channels from the (2, 2, 2) multihop channel.



Proposition 4.2: For a $(n_0, n_1, n_2)$ channel, the minimum size of a full diversity partition is

$K = \left\lceil \frac{n_1}{|n_0 - n_2| + 1} \right\rceil$. (29)

Proof: See Appendix VI-A. □
It is achieved by partitioning the relay layer into $K$ supernodes of size $\lfloor n_1/K \rfloor$ or $\lceil n_1/K \rceil$. For example,
the minimum partition size of the (2, 4, 3) channel is 2 as compared to the maximum partition
size 8; and each AF path is a (2, 2, 3) channel instead of a (1, 1, 1) channel. Another example
is the $(n, n, n)$ symmetric channel, where the minimum partition size is $n$ as compared to the
maximum partition size $n^2$; each AF path is a $(n, 1, n)$ channel.
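The two examples above can be reproduced with a few lines of code. This is a minimal sketch that assumes the minimum size is $K = \lceil n_1 / (|n_0 - n_2| + 1) \rceil$ and that the relay layer is split into supernodes whose sizes differ by at most one; the function name is hypothetical.

```python
def min_full_diversity_partition(n0, n1, n2):
    """Return (K, supernode_sizes) for a two-hop (n0, n1, n2) channel.

    Assumption: K = ceil(n1 / (|n0 - n2| + 1)), and the n1 relay antennas
    are split into K supernodes of size floor(n1/K) or ceil(n1/K).
    """
    widest = abs(n0 - n2) + 1      # largest supernode size allowed
    K = -(-n1 // widest)           # integer ceiling division
    base, extra = divmod(n1, K)
    sizes = [base + 1] * extra + [base] * (K - extra)
    return K, sizes


if __name__ == "__main__":
    print(min_full_diversity_partition(2, 4, 3))
    print(min_full_diversity_partition(3, 3, 3))
```

For the (2, 4, 3) channel the sketch gives two supernodes of size 2, i.e., two (2, 2, 3) AF paths, and for the (3, 3, 3) channel it gives three single-antenna supernodes, i.e., three (3, 1, 3) AF paths.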

Before proceeding further, a few words on related previous works are in order. In the relay
channel with direct link and a single layer of relays, the $N$-relay non-orthogonal AF (NAF)
scheme [16] divides the data frame into $N$ sub-frames, each of which is relayed by one and
only one relay. By creating a parallel NAF channel, this scheme is optimal in diversity. A similar
idea was presented in [33] in the same channel setting with a different protocol called the ND-RAF
scheme. Removing the direct link from the channel setting, the scheme in [33] becomes the
parallel AF scheme with the maximum partition in the single-antenna single-layer case.



B. Flip-and-Forward 

With the parallel AF scheme, the maximum multiplexing gain of the channel is achieved only
when every AF path in the partition achieves the maximum multiplexing gain $r_{\max} = \tilde{n}_0$. In
the following, we propose a scheme that achieves both the maximum diversity gain and the
maximum multiplexing gain. Let us consider an example first.
Example 4.1: The parallel channel $\{\Pi_1, \Pi_2\}$ in Fig. 4(a) has maximum diversity gain 4 and
multiplexing gain 1, while the parallel channel $\{\Pi'_1, \Pi'_2\}$ in Fig. 4(b) has maximum diversity
gain 4 and multiplexing gain 2.

In this example, $\{\Pi_1, \Pi_2\}$ corresponds to the parallel AF scheme based on the full diversity
partition proposed by Proposition 4.2. However, it suffers from rate-deficiency, since both sub-
channels are of rank 1. An alternative is the channel $\{\Pi'_1, \Pi'_2\}$ shown in Fig. 4(b). Note that

$\Pi'_1 = H_2 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} H_1$, $\quad \Pi'_2 = H_2 \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} H_1$.

Hence, we have

$\begin{bmatrix} \Pi'_1 & \Pi'_2 \end{bmatrix} = \begin{bmatrix} \Pi_1 & \Pi_2 \end{bmatrix} \begin{bmatrix} I & I \\ I & -I \end{bmatrix}$,

from which $\|\Pi'_1\|_F^2 + \|\Pi'_2\|_F^2 = 2\left(\|\Pi_1\|_F^2 + \|\Pi_2\|_F^2\right)$. Therefore, according to Lemma 4.1, they both
achieve the maximum diversity gain 4, except that $\{\Pi'_1, \Pi'_2\}$ has the maximum multiplexing gain
2 as well. This scheme is called the Amplify-Flip-and-Forward (AFF)* scheme, or simply the
Flip-and-Forward (FF) scheme. The intuition behind the FF scheme is as follows. It has been
shown that the mismatch between the two hops is the dominating outage event. Now, suppose
that $\Pi'_1$ is bad due to the bad "angle" between $H_1$ and $H_2$, both of which are not bad individually.
Then, in the second sub-channel, an independent "rotation" matrix $\mathrm{diag}\{1, -1\}$ is used to change
the angle. With high probability, the new angle is not bad and the mismatch is solved.
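The norm identity used in Example 4.1 can be checked numerically. The sketch below is only an illustration under stated assumptions: the parallel AF sub-channels are modeled as $\Pi_k = H_2 J_k H_1$ with $J_k$ selecting relay antenna $k$, the flip sub-channels as $\Pi'_1 = H_2 H_1$ and $\Pi'_2 = H_2\,\mathrm{diag}\{1,-1\}\,H_1$, and the relay normalization matrices are ignored. The helper names are hypothetical and only the standard library is used.

```python
import random


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def frob2(A):
    """Squared Frobenius norm."""
    return sum(abs(x) ** 2 for row in A for x in row)


def random_gaussian(rows, cols, rng):
    """Matrix with i.i.d. complex Gaussian entries (scaling is irrelevant here)."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(cols)]
            for _ in range(rows)]


def parallel_af_and_ff(H1, H2):
    """Return ((Pi_1, Pi_2), (Pi'_1, Pi'_2)) for the (2, 2, 2) channel."""
    J1, J2 = [[1, 0], [0, 0]], [[0, 0], [0, 1]]    # antenna selection
    I2, F2 = [[1, 0], [0, 1]], [[1, 0], [0, -1]]   # flip matrices
    through = lambda M: matmul(H2, matmul(M, H1))
    return (through(J1), through(J2)), (through(I2), through(F2))


if __name__ == "__main__":
    rng = random.Random(0)
    H1, H2 = random_gaussian(2, 2, rng), random_gaussian(2, 2, rng)
    (P1, P2), (Q1, Q2) = parallel_af_and_ff(H1, H2)
    print(frob2(Q1) + frob2(Q2), 2 * (frob2(P1) + frob2(P2)))
```

The two printed numbers agree up to rounding for any draw of $H_1$ and $H_2$, since $\Pi'_1 = \Pi_1 + \Pi_2$ and $\Pi'_2 = \Pi_1 - \Pi_2$.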

* The processing matrices $D_i$'s have been neglected for simplicity of demonstration.

Fig. 5. Diversity-multiplexing tradeoff of (2, 2, 2) channel with different schemes (axis label: multiplexing gain).

In the light of the example, we generalize the scheme to arbitrary number of antennas and
hops. Three steps are needed to describe the construction.

step 1 Find a full diversity independent parallel partition $\mathcal{P}$ of size $K$. The partition defines the
intermediate supernodes in each layer.

step 2 We denote the supernodes in layer $i$ by $\mathcal{S}_{i,1}, \ldots, \mathcal{S}_{i,K_i}$ with $K_i$ being the number of
supernodes in layer $i$. And we define the flip matrices $F_{i,k}$'s as $n_i \times n_i$ diagonal matrices
with

$F_{i,k}(j,j) = \begin{cases} -1, & \text{if } j \in \mathcal{S}_{i,k} \text{ and } k \ne 1, \\ 1, & \text{otherwise.} \end{cases}$

step 3 The FF scheme is composed of $K' = \prod_{i=1}^{N-1} K_i$ parallel sub-channels $\{\Pi'_k\}_{k=1}^{K'}$ with

$\Pi'_k = H_N F_{N-1, f_{N-1}(k)} H_{N-1} \cdots H_2 F_{1, f_1(k)} H_1$, (30)

where $f_1(k) = (k - 1)_{K_1} + 1$ and

$f_i(k) = \left\lfloor \frac{k-1}{\prod_{j=1}^{i-1} K_j} \right\rfloor_{K_i} + 1$, $\quad i = 2, \ldots, N-1$,

with $(m)_K$ denoting $m$ modulo $K$.

In other words, the set of relays works in $K'$ different flip modes, each one being identified
by a sequence of flip modes of individual relay layers. And the mapping is effectuated by the
functions $f_1(k), f_2(k), \ldots, f_{N-1}(k)$. The exact DMT of the FF scheme being difficult to obtain,
we get a lower bound instead.
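The mapping from the sub-channel index $k$ to the per-layer flip modes in step 3 amounts to a mixed-radix expansion of $k - 1$. The following minimal sketch assumes that reading of $f_i(k)$ (floor division by $\prod_{j<i} K_j$ followed by reduction modulo $K_i$), uses 1-based antenna indices as in the text, and introduces hypothetical function names.

```python
def flip_mode_indices(k, K):
    """Return (f_1(k), ..., f_{N-1}(k)) for K = (K_1, ..., K_{N-1}).

    Assumption: f_i(k) = floor((k - 1) / (K_1 ... K_{i-1})) mod K_i + 1.
    """
    total = 1
    for Ki in K:
        total *= Ki
    if not 1 <= k <= total:
        raise ValueError("k must lie in 1..K'")
    modes, rest = [], k - 1
    for Ki in K:
        modes.append(rest % Ki + 1)  # flip mode of this layer, 1-based
        rest //= Ki                  # carry to the next layer
    return tuple(modes)


def flip_diagonal(n_i, supernode, k):
    """Diagonal of F_{i,k}: -1 on the antennas of S_{i,k} if k != 1, else +1."""
    return [-1 if (k != 1 and j in supernode) else 1 for j in range(1, n_i + 1)]


if __name__ == "__main__":
    K = (2, 3)  # two relay layers with 2 and 3 supernodes
    for k in range(1, 7):
        print(k, flip_mode_indices(k, K))
    print(flip_diagonal(2, {2}, 2))
```

With $K = (2, 3)$ the six sub-channels enumerate every pair of layer modes exactly once. In the setting of Example 4.1, the second supernode $\{2\}$ gives the diagonal $(1, -1)$, i.e., the matrix $\mathrm{diag}\{1, -1\}$.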
Theorem 4.2: The FF scheme constructed above achieves the following DMT

$d^{FF}(r) \ge d^{AF}(r) + \left(d_{\max} - d^{AF}(0)\right)(1 - K' r)^+$, $\quad \forall r$. (31)

Proof: See Appendix VI-B. □
We can verify that $d^{FF}(0) = d_{\max}$, that is, the maximum diversity of the channel is achieved.
Furthermore, the FF scheme is always better than the AF scheme, especially at low multiplexing
gain. This can be explained by the intuition that the FF scheme solves the mismatch of adjacent
hops using all possible combinations of flip modes of individual supernodes. The equivalent
end-to-end channel of the FF scheme can be bad only if at least one of the hops is bad. The
maximum diversity is thus achieved. Fig. 5 shows the DMT of different schemes in the channel
of Example 4.1. While the AF and the parallel AF schemes achieve respectively the extreme
of maximum multiplexing gain (2, 0) and the extreme of maximum diversity gain (0, 4), the FF
scheme achieves both extremes.

Remark 4.1: The proposed FF scheme is constructed based on the flip matrices that are 
diagonal with ±1 entries. In fact, it can be shown that a looser sufficient condition is for 
the matrices to be 1) diagonal, 2) linearly independent, and 3) of unit absolute value (power 
constraints). Therefore, we can find infinitely many sets of "flip" matrices that satisfy the above 
conditions and they are all diversity optimal. Intuitively, if the matrices are too "close", the FF 
scheme tends to the AF scheme and the promised maximum diversity gain can be achieved only 
when the SNR is very large. This is translated into a poor power gain of the scheme. Hence, 
we should choose the matrices such that they are "far" from each other. In this way, with high 
enough probability, any mismatch can be solved by at least one "rotation" and the maximum 
diversity can be obtained in relatively small SNR. However, what remains open is how to choose 
the distance metric between the rotation matrices. 

C. Non-independent Partition 

With independent partition, the total diversity is the sum of the diversity of each AF path. We 
also established some conditions that independent partitions must satisfy to achieve the maximum 
diversity. In the following, we investigate a particular case of non-independent partition. 

Let us consider a parallel channel defined by $\{\Pi_k\}_k$ with

$\Pi_k = H_N \cdots H_{i+1} J_k H_i \cdots H_1$, (32)

where the selection matrix $J_k$ is a $n_i \times n_i$ diagonal matrix whose entries are zero except
that $J_k(k,k) = 1$. The matrices $\Pi_k$'s are not independent, since they share the common sub-
channels $Q_1 = H_{i-1} \cdots H_1$ and $Q_2 = H_N \cdots H_{i+2}$. However, the RP channels $H_{i+1} J_k H_i$'s are
independent for different $k$'s. Despite the dependency between the sub-channels, we can obtain
the diversity order of the parallel channel.

Theorem 4.3: The diversity order of the channel described above is

$\min\left\{ d^{AF}_{(n_0,\ldots,n_i)}(0),\ d^{AF}_{(n_i,\ldots,n_N)}(0) \right\}$. (33)

Proof: We use the DMT interpretation given in Section III-B3 to sketch the proof. One
possibility for the parallel channel $\{\Pi_k\}_k$ to be in outage is that one of $Q_1$ and $Q_2$ is bad. The
diversity is either $d^{AF}_{(n_0,\ldots,n_{i-1})}(0)$ or $d^{AF}_{(n_{i+1},\ldots,n_N)}(0)$. Another possibility is that both $Q_1$ and $Q_2$
are good and that $\{\Pi_k\}_k$ turns out to be bad. Without loss of generality, we assume that the
flow from the source to layer $i - 1$ is $k_1$ and that from layer $i + 1$ to the destination is $k_2$. And
we call the outage event a type-$(k_1, k_2)$ event. Then, it can be shown that $\{\Pi_k\}_k$ is equivalent
to $\{H'_{i+1} J_k H'_i\}_k$ with $H'_i \in \mathbb{C}^{n_i \times k_1}$ and $H'_{i+1} \in \mathbb{C}^{k_2 \times n_i}$ being Gaussian matrices with i.i.d.
$\mathcal{CN}(0, 1)$ entries. Now, we must disconnect all the sub-channels in $\{H'_{i+1} J_k H'_i\}_k$, which costs
$n_i \min\{k_1, k_2\}$. Therefore, the total cost for the type-$(k_1, k_2)$ event is

$d^{AF}_{(n_0,\ldots,n_{i-1})}(k_1) + n_i \min\{k_1, k_2\} + d^{AF}_{(n_{i+1},\ldots,n_N)}(k_2)$.

The typical outage event is the one that minimizes the above total cost. For $k_2 \ge k_1$, using (20),
we can show that the minimum total cost is $d^{AF}_{(n_0,\ldots,n_i)}(0)$. Similarly, $d^{AF}_{(n_i,\ldots,n_N)}(0)$ is the minimum
total cost for $k_1 \ge k_2$. Since both costs are smaller than $d^{AF}_{(n_0,\ldots,n_{i-1})}(0)$ and $d^{AF}_{(n_{i+1},\ldots,n_N)}(0)$ by
the monotonicity (Corollary 3.2), we proved the theorem. □

Note that with this particular partition at layer $i$, we achieve a diversity order as if layer $i$ were
clustered and the cooperative DF scheme were used. This result implies that one might achieve
the maximum diversity with a partition of small size. For example, the maximum diversity order
of the (3, 2, 2, 2, 3) channel is 4 and all the full diversity independent partitions are of size $K = 8$,
i.e., eight (3, 1, 1, 1, 3) sub-channels. With the non-independent partition described above, we get
a couple of (3, 2, 1, 2, 3) sub-channels, i.e., size 2. Since $d^{AF}_{(2,2,3)}(0) = 4$, the maximum diversity
4 is achieved as well according to Theorem 4.3.

We can apply the FF scheme to the case of non-independent partition. Then, in this example,
the parallel channel is $\{\Pi'_k\}_k$ with $\Pi'_k = H_N \cdots H_{i+1} F_k H_i \cdots H_1$, where the flip matrix $F_k$ is a
$n_i \times n_i$ diagonal matrix whose entries are one except that $F_k(k,k) = -1$ if $k \ne 1$. The channels
$\{\Pi'_k\}_k$ being a linear invertible transformation of $\{\Pi_k\}_k$, the generalized FF scheme achieves
the diversity given by (33).

D. Extensions 

With the nice parallel-channel structure, the FF scheme can be extended to various cases. 

Let us first consider the extension to the MIMO relay channel with direct link and a single
layer of relays. By applying directly the single-antenna NAF scheme [16] to the multi-antenna
case, the source cooperates with one relay at a time. This is equivalent to using the parallel AF
scheme in the source-relays-destination link. The DMT lower bound is obtained in [34] as

$d_F(r) + N\, d^{AF}_{(n_t, n, n_r)}(2r)$, (34)

where $d_F(r)$ is the DMT of the $n_t \times n_r$ source-destination channel $F$ and each relay has $n$
antennas. In fact, this lower bound can be improved to

$d_F(r) + d^{FF}_{(n_t, Nn, n_r)}(2r)$, (35)

by replacing the parallel AF scheme in the source-relays-destination link with the FF scheme.
Comparing the second terms from (34) and (35), the gain in diversity of the new scheme over
the MIMO NAF is reflected by

$N\, d^{AF}_{(n_t, n, n_r)}(0) \le N n \min\{n_t, n_r\}$, (36)

where the inequality (36) becomes strict when $n$ is large. The gain in multiplexing of the source-
relays-destination link is obvious when $n$ is small, i.e., $n < \min\{n_t, n_r\}$. In this case, the FF
scheme pools the relay antennas together to provide more degrees of freedom.

Another extension is to the multiuser case. Let us take the multiple access channel as an
example. For simplicity, we assume that $M$ users try to communicate with the common destination
through the same layers of relays. Then, with the FF scheme, we have an equivalent parallel
multiple access channel with

$y_k = \sum_{i=1}^{M} \Pi_{k,i}\, x_i + z_k$, $\quad k = 1, \ldots, K'$, (37)

where $\{\Pi_{k,i}\}_k$ is similarly defined as in (30) with

$\Pi_{k,i} = H_N F_{N-1, f_{N-1}(k)} H_{N-1} \cdots H_2 F_{1, f_1(k)} H_{1,i}$. (38)

Note that only the first hop is distinct for different users. Using the techniques of [24] and
our results for the single-user FF scheme, it is possible to analyze the DMT of the FF scheme
in the multiple access channel. It is straightforward to show that a similar extension also holds for the
broadcast channels with minor modifications.

V. The Clustered Case Revisited 

In Section III-D, it has been shown that the cooperative DF scheme achieves the DMT cut-set
bound in the clustered case. In this section, we would like to study some alternative schemes, 
since it might be impossible or unnecessary for all the clusters to decode the source message in 
some cases. 



A. Serial Partition 

The AF and the cooperative DF schemes can in fact be seen as two extremes of what we call 
the serial partition of multihop channels, defined as follows. 

Definition 5.1: A serial partition is defined by a set of layer indices $\mathcal{D} = \{\mathcal{D}_1, \ldots, \mathcal{D}_{|\mathcal{D}|}\}$
with $0 < \mathcal{D}_1 < \mathcal{D}_2 < \cdots < \mathcal{D}_{|\mathcal{D}|-1} < \mathcal{D}_{|\mathcal{D}|} = N$, each layer performing the cooperative decode-and-
forward operation.

With a serial partition, the multihop channel becomes a serial concatenation of $|\mathcal{D}|$ AF channels.
As in (11), the DMT of the multihop channel with any partition $\mathcal{D}$ is easily derived as

$d_{\mathcal{D}}(r) = \min_{i=1,\ldots,|\mathcal{D}|} d^{AF}_{(n_{\mathcal{D}_{i-1}}, \ldots, n_{\mathcal{D}_i})}(r)$, (39)

where we defined $\mathcal{D}_0 = 0$. To get the maximum diversity gain, the question of when to decode
has been answered earlier: when the conditions in (23) are not met. Another question is where
to decode, i.e., how to find the partition of minimum size that achieves a given diversity order.

Proposition 5.1: Let us take $\mathcal{D}_0 = 0$ and successively decide $\mathcal{D}_i$ as the maximum integer
in $(\mathcal{D}_{i-1}, N]$ such that

$d^{AF}_{(n_{\mathcal{D}_{i-1}}, \ldots, n_{\mathcal{D}_i})}(0) \ge d$. (40)

Then, the decoding set $\{\mathcal{D}_i\}$ defines the partition of minimum size that achieves a given diversity
$d\ (\le d_{\max})$.

Proof: From (39), it is easy to show that the proposed partition achieves diversity $d$. Now,
we would like to show that the size of the proposed partition is minimized. To this end, it
is enough to show that for any set $\mathcal{D}'$ of decoding points that achieves diversity $d$, we have
$\mathcal{D}'_i \le \mathcal{D}_i$, $\forall i$. This is obviously true for $\mathcal{D}'_1$, since the diversity of the AF channel degrades
with the number of hops. By induction on $i$, it is shown that $\mathcal{D}'_{i+1} \le \mathcal{D}_{i+1}$ because otherwise
$(n_{\mathcal{D}_i}, \ldots, n_{\mathcal{D}_{i+1}}) \subset (n_{\mathcal{D}'_i}, \ldots, n_{\mathcal{D}'_{i+1}})$ and the corresponding diversity of the AF scheme cannot
be larger than $d$ according to the monotonicity of the DMT (Corollary 3.2). □
The proposition matches the intuition that we should only decode when we have to, in the
diversity sense. In other words, we allow for the degradation of diversity introduced by the AF
operation, as long as the resulting diversity is not smaller than the target $d$.
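The greedy rule of Proposition 5.1 can be sketched as follows. The rule itself is taken from the proposition, but the routine needs the AF diversity $d^{AF}_{(\cdot)}(0)$ of an arbitrary segment, which this section does not spell out. The closed form in `af_diversity` below is therefore an assumption brought in for illustration: $d(0) = \sum_{i=1}^{\tilde{n}_0} c_i$ with $c_i = 1 - i + \min_{k=1,\ldots,N} \lfloor (\sum_{l=0}^{k} \tilde{n}_l - i)/k \rfloor$ and $\tilde{n}$ the antenna numbers sorted in increasing order. It reproduces the values quoted in the text (3 for (2, 2, 2), 4 for (2, 2, 3)), and any other evaluator can be passed in instead. All names are hypothetical.

```python
def af_diversity(n):
    """Maximum diversity d(0) of the AF chain n = (n_0, ..., n_N).

    ASSUMPTION: closed form sum_i c_i over the sorted antenna numbers,
    c_i = 1 - i + min_k floor((n~_0 + ... + n~_k - i) / k), k = 1..N.
    """
    s = sorted(n)
    prefix = [s[0]]
    for x in s[1:]:
        prefix.append(prefix[-1] + x)  # prefix[k] = n~_0 + ... + n~_k
    hops = len(s) - 1
    return sum(
        1 - i + min((prefix[k] - i) // k for k in range(1, hops + 1))
        for i in range(1, s[0] + 1)
    )


def serial_partition(n, d_target, diversity=af_diversity):
    """Greedy decoding layers: D_i is the largest index in (D_{i-1}, N]
    whose AF segment (n_{D_{i-1}}, ..., n_{D_i}) has diversity >= d_target."""
    N = len(n) - 1
    points, prev = [], 0
    while prev < N:
        candidates = [j for j in range(prev + 1, N + 1)
                      if diversity(n[prev:j + 1]) >= d_target]
        if not candidates:
            raise ValueError("target diversity exceeds d_max")
        prev = max(candidates)
        points.append(prev)
    return points


if __name__ == "__main__":
    print(serial_partition((2, 2, 2), 4))
    print(serial_partition((2, 2, 2), 3))
    print(serial_partition((3, 2, 2, 2, 3), 4))
```

Under this assumed diversity function, the (2, 2, 2) channel needs a decoding relay layer for target diversity 4 but none for target 3, and the (3, 2, 2, 2, 3) channel reaches diversity 4 by decoding only at the middle layer and at the destination.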

B. CSI Aided Linear Processing 

Another option is to linearly process the received signal at each cluster without decoding it.
Unlike the AF scheme in the non-clustered case, where trivial antenna-wise normalization is
performed, we can run inter-antenna processing based on the available CSI at the cluster. With
receiver CSI at the relays, let us consider the following project-and-forward (PF) scheme. At
layer $i$, the received signal is first projected to the signal subspace spanned by the columns of the
channel matrix $\underline{H}_i$. The dimension of the subspace is $r_i$, the rank of $\underline{H}_i$. After the component-
wise normalization, the projected signal is transmitted using $r_i$ (out of $n_i$) antennas. It is now
clear that $\underline{H}_{i+1} \in \mathbb{C}^{n_{i+1} \times r_i}$ is actually composed of $r_i$ columns of the previously defined
$H_{i+1}$, with $r_0 = n_0$. More precisely, $Q_i \in \mathbb{C}^{n_i \times r_i}$ is an orthogonal basis with $Q_i^{\dagger} Q_i = I$. We
can rewrite

$\underline{H}_i = Q_i G_i$,

with $G_i \in \mathbb{C}^{r_i \times r_{i-1}}$. For simplicity, we let $Q_i$ be obtained by the QR decomposition [35] of
$\underline{H}_i$ if $n_i > r_{i-1}$ and be the identity matrix if $n_i \le r_{i-1}$. The spirit of the PF scheme is not to use
more antennas than necessary to forward the signal. Since the useful signal lies only in the
$r_i$-dimensional signal subspace, the projection of the received signal provides sufficient statistics
and reduces the noise power by a factor $n_i/r_i$. In this case, only $r_i$ antennas are needed to forward
the projected signal. Let us define $P_i = [\ldots]$. Then, as in the AF case, the PF multihop channel
is equivalent to the channel defined by

$\Pi_{PF} = \underline{H}_N P_{N-1} \cdots \underline{H}_2 P_1 \underline{H}_1$.

The following proposition states that receiver CSI and inter-antenna processing do not improve 
the DMT of the AF scheme. 

Proposition 5.2: The PF scheme is equivalent to the AF scheme. 

Proof: See Appendix VI-C. □
While the PF and AF have the same DMT, the PF outperforms the AF in power gain for two
reasons. One reason is, as stated before, that the projection reduces the average noise power.
The other reason is that the accumulated noise in the AF case is more substantial than that in
the PF case. This is because in the PF case, fewer relay antennas are used than in the AF case.
Since the powers of independent noises from different transmit antennas add up at the receiver
side, the accumulated noise in the AF case "enjoys" a larger "transmit diversity order" than in
the PF case.

On the other hand, if we could have receiver and transmitter CSI at the clusters, the DMT 
could be improved as shown by the following example. 

Example 5.1: For a (n,n, . . . ,n) clustered multihop channel, the DMT cut-set bound can be 
achieved by linear processing within clusters if both transmitter and receiver CSI are available 
at each cluster. 

The optimum linear relaying scheme is defined by the processing matrices $T_i$'s with $T_i = V_{i+1} U_i^{\dagger}$,
where we assume that $H_i = U_i \Sigma_i V_i^{\dagger}$ is the singular value decomposition of $H_i$. The diagonal
elements in the singular value matrix $\Sigma_i$ are in increasing order. This scheme matches the adjacent
hops by aligning the singular values in the same order. It is then equivalent to the channel defined
by $\prod_i \Sigma_i$, whose DMT can be shown* to be the same as that of the $n \times n$ Rayleigh channel.

VI. Codes Construction 

Now, we need codes that actually attain the DMT promised by the studied relaying strategies. 
To this end, the construction of Perfect STBCs [26], [27] for MIMO channels is extended to the 
multihop relay channels. The constructed codes are approximately universal [28]. 

* The proof, which is essentially the same as the proof in [23], is omitted here.
A. The Clustered Case

The relay clusters that perform the cooperative DF operation partition the multihop channel 
into a series of \V\ MIMO channels, say, Hi,H2, • • • ,H\t:\ with Hi G C"^i''"^'i-i . An obvious 
coding scheme that achieves the DMT is described as follows. Let r be the target multiplexing 
gain. First, the source terminal encodes the message of TrlogSNR bits with a uq x T Perfect 
STBC Ao(r). Then, in a successive manner, layer Vi tries to decode the message. When a success 
decoding is assumed, the TrlogSNR bits are encoded with a nx>i x T Perfect STBC A'j(r) and 
forwarded. We can show that as long as T > T^in with 

Tinin = max nn,_i, 

i=l,...,\V\ 

the series of Perfect STBCs can be found [27]. With the union bound, the end-to-end 

error probability is upper-bounded 

\v\ 

Pe(r, SNR) < ^pW(r,SNR), (41) 

1=1 

where Pi*'' is the error probability of Xi{r) in the MIMO sub-channel Hi. Since Xi{r) is DMT- 
achieving for any fading statistics, we have 

Pi*)(^,SNR) = SNR (42) 

From gB and (|42l), the DMT 021) is achieved with coding delay T^in- Since the Perfect STBCs 
are approximately universal [28], so is this coding scheme. Note that this scheme can be used 
for the AF and PF schemes with \V\ = 1. 
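
The bookkeeping behind this scheme can be sketched numerically. The snippet below is only an illustration under stated assumptions: every layer decodes (all-DF), each hop is an i.i.d. Rayleigh MIMO channel whose DMT is the well-known piecewise-linear Zheng-Tse curve, and the end-to-end exponent is the smallest per-hop exponent, as the union bound (41) together with (42) suggests. The function names `mimo_dmt` and `all_df_chain` are hypothetical and do not appear in the paper.

```python
from math import floor


def mimo_dmt(n_tx, n_rx, r):
    """DMT of an i.i.d. Rayleigh n_tx x n_rx MIMO channel (Zheng-Tse curve).

    The curve is the piecewise-linear interpolation of the points
    (k, (n_tx - k) * (n_rx - k)) for k = 0, ..., min(n_tx, n_rx).
    """
    q = min(n_tx, n_rx)
    if not 0 <= r <= q:
        raise ValueError("multiplexing gain must lie in [0, min(n_tx, n_rx)]")
    k = min(floor(r), q - 1)  # left end of the linear segment containing r
    d_left = (n_tx - k) * (n_rx - k)
    d_right = (n_tx - k - 1) * (n_rx - k - 1)
    return d_left + (r - k) * (d_right - d_left)


def all_df_chain(antennas, r):
    """Coding delay and DMT exponent when every layer decodes (assumption).

    antennas = (n_0, ..., n_N): source, relay clusters, destination.
    Returns (T_min, d(r)), where T_min is the largest number of transmit
    antennas over all hops and d(r) is the smallest per-hop exponent.
    """
    hops = list(zip(antennas[:-1], antennas[1:]))
    t_min = max(n_tx for n_tx, _ in hops)
    dmt = min(mimo_dmt(n_tx, n_rx, r) for n_tx, n_rx in hops)
    return t_min, dmt


if __name__ == "__main__":
    # (3, 1, 4, 2) channel at multiplexing gain 0: prints (4, 3)
    print(all_df_chain((3, 1, 4, 2), 0))
```

For the (3, 1, 4, 2) channel this gives a minimum coding delay of 4 and a diversity order of 3, which is consistent with the all-DF diversity order reported for that channel in Section VII.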

B. The Non-Clustered Case 

In the non-clustered case, the parallel AF and the FF schemes are used. Note that both schemes 
share the common parallel MIMO channel structure 

$$y_k = \Pi_k x_k + z_k, \quad k = 1, \ldots, K, \qquad (43)$$

where $\Pi_k \in \mathbb{C}^{n_{r,k} \times n_{t,k}}$ and $K$ is the number of the parallel sub-channels. Let $\mathcal{X}$ be a code for 
the parallel channel. A codeword is defined by a set of matrices $\{X_k\}_{k=1}^{K}$ with $X_k \in \mathbb{C}^{n_{t,k} \times T}$. 
We define a parallel STBC with non-vanishing determinant (NVD) as follows. 
Definition 6.1: Let $\mathcal{B}$ be an alphabet that is scalably dense, i.e., for $0 \le a \le 1$, 

$$|\mathcal{B}(\mathrm{SNR})| \doteq \mathrm{SNR}^{a} \quad \text{and} \quad s \in \mathcal{B}(\mathrm{SNR}) \Rightarrow |s|^2 \,\dot{\le}\, \mathrm{SNR}^{a}.$$

Then, a parallel STBC $\mathcal{X}$ is called a parallel NVD code if it 

1) is $\mathcal{B}$-linear; 

2) has full symbol rate, i.e., it transmits on average $\sum_k n_{t,k}$ symbols per channel use from the 
signal constellation $\mathcal{B}$; 

3) has the NVD property, i.e., for any pair of different codewords $\{X_k\}_k, \{\hat{X}_k\}_k \in \mathcal{X}$, 

$$\prod_k \det\big( (X_k - \hat{X}_k)(X_k - \hat{X}_k)^{\dagger} \big) \ge \kappa > 0, \qquad (44)$$

with $\kappa$ a constant independent of the SNR. 
We have the following result. 

Theorem 6.1: The parallel NVD codes are approximately universal over the parallel channel 
defined by (43). 

Proof: See Appendix VI-D. □ 
Thus, to achieve the DMT of the parallel AF and the FF schemes, it is enough to construct 
parallel NVD codes. Several remarks are made before proceeding to the code construction. 

Remark 6.1: The actual data rate of the NVD codes is controlled by the size of the alphabet 
$\mathcal{B}$ and the symbol rate. Efficient decoding schemes (e.g., sphere decoding) may not be implementable 
when the channel is under-determined or, alternatively speaking, rank-deficient in the 
sense that $\sum_k \mathrm{rank}(\Pi_k) < \sum_k n_{t,k}$. Practical schemes include reducing the symbol rate while 
increasing the size of the alphabet $\mathcal{B}$. This, however, does not guarantee the DMT-achievability. 

Remark 6.2: Since explicit parallel NVD codes for asymmetric parallel channels (i.e., $n_{t,i} \ne n_{t,j}$ 
for some $i \ne j$) are hard to construct algebraically, we focus on the symmetric case. Note 
that in the FF scheme, the equivalent parallel channel is always symmetric. In the parallel AF 
scheme, the numbers of transmit antennas of different sub-channels may be different. However, 
the problem can be overcome by using the same number of antennas (i.e., $\max_k n_{t,k}$). The 
resulting parallel channel has at least the same DMT as the original channel. Nevertheless, an 
alternative code construction that is suitable for both symmetric and asymmetric parallel channels 
is provided in Appendix VI-E for completeness. 

$\mathcal{X}$ is $\mathcal{B}$-linear means that each entry of any codeword in $\mathcal{X}$ is a linear combination of symbols from $\mathcal{B}$. 

Remark 6.3: From a given parallel partition with size S, the number of the parallel sub- 
channels is S* in the parallel AF scheme, generally larger than S in the FF scheme. Since the 
minimum coding delay is $K \max_k n_{t,k}$, which grows linearly with $K$, it grows at least linearly with 
S. Moreover, the complexity of decoding can grow exponentially with K if ML decoding 
is used. That is why it is important to find partitions of small size S. 



C. Algebraic Construction of Parallel NVD Codes 

A systematic way to construct NVD codes is the construction from cyclic division algebra (CDA). 
For more details on the concept, the readers can refer to [36]. In the following, we 
aim to construct the Perfect symmetric parallel NVD codes with quadrature amplitude modulation 
(QAM) constellations. The generalization to hexagonal constellations is straightforward. 

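1) K = 1: We start by the construction of NVD codes for MIMO channels (K = 1). Let 
$\mathbb{L} = \mathbb{Q}(i)(\theta)$ be a cyclic extension of degree $n_t$ on the base field $\mathbb{Q}(i)$. We denote by $\sigma$ the generator 
of the Galois group $\mathrm{Gal}(\mathbb{L}/\mathbb{Q}(i))$. Let $\gamma \in \mathbb{Q}(i)$ be such that $\gamma, \gamma^2, \ldots, \gamma^{n_t-1}$ are non-norm 
elements in $\mathbb{L}$. Then, we can construct a CDA $\mathcal{A} = (\mathbb{L}/\mathbb{Q}(i), \sigma, \gamma)$ of degree $n_t$. Each element 
in $\mathcal{A}$ has the following matrix representation 

$$\Xi = \begin{pmatrix} x_0 & x_1 & \cdots & x_{n_t-1} \\ \gamma\sigma(x_{n_t-1}) & \sigma(x_0) & \cdots & \sigma(x_{n_t-2}) \\ \vdots & & \ddots & \vdots \\ \gamma\sigma^{n_t-1}(x_1) & \gamma\sigma^{n_t-1}(x_2) & \cdots & \sigma^{n_t-1}(x_0) \end{pmatrix} \qquad (45)$$

where $x_i \in \mathcal{O}_{\mathbb{L}}$, $\forall i$. Since $\mathcal{A}$ is a CDA, we can show that $\det \Xi \in \mathbb{Z}[i]$ and that the determinant 
is zero only when $\Xi$ is a zero matrix. Thus, the NVD property is proved by considering that the 
difference matrix of each pair of codewords is in the form of $\Xi$. 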

It is usually desirable to get an STBC with good shaping. To this end, we can impose the 
additional constraint that the vectorized codeword is a rotated version of a QAM$^{n_t^2}$ constellation, 
also known as the cubic constellation. Constructions of rotated constellations from algebraic number 
fields are now well known (see, e.g., [38] for a comprehensive tutorial on this topic). This can 
be made possible if 1) the $x_i$'s in the matrix $\Xi$ belong to some properly chosen ideal $\mathcal{I} \subseteq \mathcal{O}_{\mathbb{L}}$ [39], 
and 2) $|\gamma| = 1$ (see [27] for a general method). The thus-constructed NVD codes are well-known 
as the Perfect STBCs. 

Fig. 6. Field extension tower. 

The construction was first reported in [37] and is included for the sake of completeness. 
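
As a concrete instance of the K = 1 construction with two transmit antennas, the following sketch checks the NVD property numerically for the Golden code, i.e., the 2 x 2 Perfect STBC built on $\mathbb{Q}(i, \sqrt{5})$ with $\gamma = i$ and $\alpha = 1 + i - i\theta$. It is a brute-force check over a small set of Gaussian integers rather than a proof, the codewords are left unnormalized, and the function names are hypothetical.

```python
import itertools
import math

THETA = (1 + math.sqrt(5)) / 2        # golden number
THETA_BAR = (1 - math.sqrt(5)) / 2    # its conjugate sigma(theta)
ALPHA = 1 + 1j - 1j * THETA           # generator of the shaping ideal
ALPHA_BAR = 1 + 1j - 1j * THETA_BAR   # sigma(alpha)
GAMMA = 1j                            # non-norm element with |gamma| = 1


def codeword_det(a, b, c, d):
    """Determinant of the 2 x 2 matrix of the form (45) with
    x_0 = alpha * (a + b * theta) and x_1 = alpha * (c + d * theta)."""
    top_left = ALPHA * (a + b * THETA)
    top_right = ALPHA * (c + d * THETA)
    bottom_left = GAMMA * ALPHA_BAR * (c + d * THETA_BAR)
    bottom_right = ALPHA_BAR * (a + b * THETA_BAR)
    return top_left * bottom_right - top_right * bottom_left


def min_det_sq(symbols):
    """Smallest |det|^2 over all non-zero symbol quadruples (a, b, c, d)."""
    best = math.inf
    for a, b, c, d in itertools.product(symbols, repeat=4):
        if a == 0 and b == 0 and c == 0 and d == 0:
            continue
        best = min(best, abs(codeword_det(a, b, c, d)) ** 2)
    return best


# Gaussian integers with real and imaginary parts in {-1, 0, 1}
GAUSSIAN = [complex(re, im) for re in (-1, 0, 1) for im in (-1, 0, 1)]

if __name__ == "__main__":
    print(min_det_sq(GAUSSIAN))  # close to 5, up to floating-point error
```

The minimum squared determinant stays at 5 (that is, $|\alpha\sigma(\alpha)|^2$ times a non-zero Gaussian integer of modulus at least 1), independently of how large the symbol set is made, which is the non-vanishing behaviour the construction is designed to provide.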

2) K > 1: The construction of parallel NVD codes is similar to the construction presented 
above. First, we construct a CDA in the same manner as the previous case by simply 1) replacing 
the base field $\mathbb{Q}(i)$ by a new field $\mathbb{F}$, a Galois extension of degree $K$ over $\mathbb{Q}(i)$; 2) replacing 
the field $\mathbb{L}$ by $\mathbb{K} = \mathbb{F}(\theta)$, a cyclic extension of degree $n_t$ over $\mathbb{F}$ (same $\theta$ as the previous 
case); and 3) choosing $\gamma$ such that $\gamma, \gamma^2, \ldots, \gamma^{n_t-1}$ are non-norm elements in $\mathbb{K}$. We impose 
that $\mathbb{F} \cap \mathbb{L} = \mathbb{Q}(i)$. Note that the extension $\mathbb{K}/\mathbb{F}$ remains cyclic with the same Galois group as 
$\mathrm{Gal}(\mathbb{L}/\mathbb{Q}(i))$ (Fig. 6). Thus, the constructed CDA is $\mathcal{A} = (\mathbb{K}/\mathbb{F}, \sigma, \gamma)$. Now, let $\{\tau_1, \tau_2, \ldots, \tau_K\}$ be 
the Galois group of the extension $\mathbb{F}/\mathbb{Q}(i)$ and define 

$$\Xi_k \triangleq \tau_k(\Xi), \quad k = 1, \ldots, K,$$

where $\Xi$ is the matrix representation of some element in $\mathcal{A}$ and is in the form (45). Now, we 
have 

$$\prod_k \det \Xi_k = \prod_k \tau_k(\det \Xi) = N_{\mathbb{F}/\mathbb{Q}(i)}(\det \Xi) \qquad (46)$$

that is in $\mathbb{Z}[i]$. Finally, we construct codewords $\{X_k\}_k$ in the form of $\{\Xi_k\}_k$ with QAM symbols 
and we can show that the difference matrix of a pair of different codewords is also in the 
form of $\{\Xi_k\}_k$ with symbols in $\mathbb{Z}[i]$. The NVD condition (44) is thus met. Similarly, the cubic 
shaping can be obtained with the same kind of conditions mentioned before. An explicit code 
construction is provided in the following example. 

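Example 6.1 (Two transmit antennas and $K = 2^m$ sub-channels): Let us define $\zeta_{2^{m+2}} \triangleq e^{i\pi/2^{m+1}}$. 
Then, we consider the base field $\mathbb{F} = \mathbb{Q}(\zeta_{2^{m+2}})$, an extension of $\mathbb{Q}(i)$ of degree $2^m$, and 
take $\mathbb{K} = \mathbb{F}(\sqrt{5}) = \mathbb{Q}(\zeta_{2^{m+2}}, \sqrt{5})$. We can verify that $\gamma = \zeta_{2^{m+2}}$ is a non-norm element in 
$\mathbb{K}$ (see Appendix VI-F). Let $\theta = \frac{1+\sqrt{5}}{2}$ and $\sigma : \theta \mapsto \bar{\theta} = \frac{1-\sqrt{5}}{2}$. The ring of integers of $\mathbb{K}$ 
is $\mathcal{O}_{\mathbb{K}} = \{a + b\theta \mid a, b \in \mathbb{Z}[\zeta_{2^{m+2}}]\}$. The chosen ideal is principal, i.e., $\mathcal{I} = (\alpha)\mathcal{O}_{\mathbb{K}}$ with 
$\alpha = 1 + i - i\theta$. The matrix $\Xi$ is given by 

$$\Xi = \begin{pmatrix} \alpha\,(a + b\theta) & \alpha\,(c + d\theta) \\ \gamma\bar{\alpha}\,(c + d\bar{\theta}) & \bar{\alpha}\,(a + b\bar{\theta}) \end{pmatrix}$$

where $a, b, c, d \in \mathbb{Z}[\zeta_{2^{m+2}}]$. We can show that the shaping property is satisfied and finally, this 
code is a perfect STBC for the parallel channel. 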
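
The parallel NVD property of this example can be checked numerically in the smallest case $m = 1$ ($K = 2$, $\mathbb{F} = \mathbb{Q}(\zeta_8)$, $\gamma = \zeta_8$). The sketch below is an illustration under stated assumptions, not the paper's own verification: each symbol in $\mathbb{Z}[\zeta_8]$ is written as $u + v\zeta_8$ with Gaussian integers $u, v$, the non-trivial element of $\mathrm{Gal}(\mathbb{F}/\mathbb{Q}(i))$ is taken to be $\tau : \zeta_8 \mapsto -\zeta_8$ (which fixes $i = \zeta_8^2$), and the codewords are unnormalized. All function names are hypothetical.

```python
import cmath
import math
import random

THETA = (1 + math.sqrt(5)) / 2
THETA_BAR = (1 - math.sqrt(5)) / 2
ALPHA = 1 + 1j - 1j * THETA
ALPHA_BAR = 1 + 1j - 1j * THETA_BAR
ZETA8 = cmath.exp(1j * math.pi / 4)   # primitive 8th root of unity


def det_xi(symbols, zeta):
    """det of the 2 x 2 matrix of Example 6.1 when zeta_8 is mapped to `zeta`.

    symbols: four pairs (u, v) of Gaussian integers, each standing for the
    element u + v * zeta_8 of Z[zeta_8]. Passing zeta = ZETA8 gives Xi_1;
    passing zeta = -ZETA8 applies tau entrywise and gives Xi_2.
    """
    a, b, c, d = (u + v * zeta for u, v in symbols)
    gamma = zeta
    top_left = ALPHA * (a + b * THETA)
    top_right = ALPHA * (c + d * THETA)
    bottom_left = gamma * ALPHA_BAR * (c + d * THETA_BAR)
    bottom_right = ALPHA_BAR * (a + b * THETA_BAR)
    return top_left * bottom_right - top_right * bottom_left


def det_product(symbols):
    """Product of det(Xi_k) over the K = 2 sub-channels, as in (46)."""
    return det_xi(symbols, ZETA8) * det_xi(symbols, -ZETA8)


def random_symbols(rng, bound=2):
    """Four random symbols with integer coordinates in [-bound, bound]."""
    def gauss():
        return complex(rng.randint(-bound, bound), rng.randint(-bound, bound))
    return [(gauss(), gauss()) for _ in range(4)]


if __name__ == "__main__":
    rng = random.Random(0)
    smallest = math.inf
    for _ in range(2000):
        syms = random_symbols(rng)
        if all(u == 0 and v == 0 for u, v in syms):
            continue
        smallest = min(smallest, abs(det_product(syms)))
    print(smallest)
```

For every non-zero choice of symbols the product is (numerically) a non-zero Gaussian integer, so its modulus never drops below a fixed constant. This is the behaviour required by (44), and it relies on $\zeta_8$ being a non-norm element as stated in the example.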

VII. Numerical Examples 

In this section, we present numerical results on the proposed schemes. The performance 
measures are either the outage probability or the symbol error rate versus the average 
received SNR per bit. The results are obtained with Monte-Carlo simulations. 

The first example is to illustrate the impact of vertical reduction of multihop channels, as 
shown in Fig. 8(a). In a (1, 4, 1) channel, the necessary antenna number n from (15) is 1 and 
the minimal vertical form is thus (1, 1, 1). We observe that, with the same diversity order 1, an 
asymptotic power gain of 7 dB is obtained by using only one relay antenna out of four, if the 
AF scheme is used. The gain is due to the fact that using more relaying antennas hardens the 
relayed noise. In the (3, 1, 4, 2) channel, the necessary number of antennas n from (15) is 2. As 
shown in Fig. 8(a), by restricting the number of relay antennas to 2, we have a (3, 1, 2, 2) channel 
and an asymptotic power gain of 2 dB is observed. We can further reduce the number of transmit 
antennas to 2 to get a (2, 1, 2, 2) channel. Unlike the reduction of relay antennas, the reduction 
of transmit antennas does not provide any gain because it does not impact the relayed noise. In 
contrast, it degrades the performance since the first hop (2, 1) is faded more seriously than the 
original first hop (3, 1). Nevertheless, the (2, 1, 2, 2) channel is still better than the (3, 1, 4, 2) 
channel and is only 0.7 dB from the (3, 1, 2, 2) channel. The coded performance of the (3, 1, 4, 2) 
channel is then studied in Fig. 8(b). The diagonal algebraic space-time (DAST) code [40] can be 
used. As shown in Fig. 8(b), with the DAST code, the symbol error rate performances in 
the (3, 1, 4, 2), (3, 1, 2, 2) and (2, 1, 2, 2) channels have exactly the same behavior as the outage 
performances of the channels in Fig. 8(a). Moreover, the reduction in the number of transmit 
antennas allows us to use the Alamouti code [41] (the (2, 1, 2, 2) channel). As we can see in the 
figure, the Alamouti code, besides the advantage of lower decoding complexity, outperforms all 
the DAST codes. The potential benefits from the vertical reduction are thus highlighted. 

Then, we consider the parallel partition of two multihop channels: the (2, 2, 2, 2) and (2, 4, 3) 
channels. The resulting AFF scheme is compared to the AF scheme in terms of both the outage 
probability and the symbol error rate. With the AFF scheme, we create respectively four and 
two parallel sub-channels with two transmit antennas for the (2, 2, 2, 2) and (2, 4, 3) channels. 
Specifically, the AFF scheme for the (2, 2, 2, 2) channel is based on a partition of four (2, 1, 1, 2) 
sub-channels and for the (2, 4, 3) channel is a partition of two (2, 2, 3) sub-channels. As shown 
in Fig. 9(a), the diversity order of the AFF scheme for the (2, 2, 2, 2) channel (respectively, the (2, 4, 3) 
channel) is 4 (respectively, 8), as compared to that of the AF scheme (3 and 6, respectively). 
The coded performance is also studied. We apply the construction provided by Example 6.1 to 
get Perfect parallel STBCs for two and four sub-channels. As we can observe in Fig. 9(b), with 
the use of Perfect codes, the symbol error rate performance behaves similarly to the outage 
performance. 

The last example is a (3, 1, 4, 2) channel in the clustered case. Through this example, we would 
like to address the impact of "where to decode" on the end-to-end performance. The all-AF and 
all-DF schemes correspond respectively to the case with no decoding relay cluster and that with 
two decoding relay clusters. With one decoding cluster, the choice is made between decoding 
at the first cluster and decoding at the second one. As shown in Fig. 10, the all-AF scheme has 
diversity order 2 and the all-DF scheme has diversity order 3, as analytically expected. With 
only one decoding cluster, the diversity order is also predictable: diversity 2 when decoding at the 
single-antenna cluster and diversity 3 when decoding at the four-antenna cluster. What is impressive in this example 
is that the two curves with different choices of decoding cluster join the all-AF and all-DF 
curves respectively at high SNR. Therefore, only one decoding cluster is enough to achieve 
good performance in this case, and the decoding cluster should not be the single-antenna node. 

Note that the DAST code is the diagonal version of the rate-one Perfect code proposed in [26]. 

VIII. Conclusion 

The diversity of MIMO multihop relay channels has been investigated in both the clustered 
and non-clustered cases. Our results showed that, in both cases, the maximum diversity gain 
and the maximum multiplexing gain of the multihop channel can be achieved. In the clustered 
case, the optimal scheme is cooperative decode-and-forward that achieves the upper bound on 
the diversity-multiplexing tradeoff of the channel. In the non-clustered case, the key to achieving 
the maximum diversity is space-time relay processing. Our approach is to introduce temporal 
processing to the amplify-and-forward scheme by creating a parallel channel in the time domain. 
We proposed a flip-and-forward scheme that achieves both the maximum diversity and multiplexing gains 
of an arbitrary multihop channel in a distributed manner. We also showed that the FF scheme 
can be easily extended to the multiuser case. With its low relaying and signaling complexity, the 
FF scheme is suitable for wireless ad hoc networks. Approximately universal coding schemes 
have been proposed for all the relaying strategies studied in this work. 
Appendix I 
Preliminaries 

The following are some preliminary results that are essential to the proofs. 

Lemma A1.1 (Calculation of diversity-multiplexing tradeoff): Consider a linear fading Gaussian 
channel defined by $H$ for which $\det(I + \mathrm{SNR}\, H H^{\dagger})$ is a function of $\lambda$, a vector of positive 
random variables. Then, the DMT $d_H(r)$ of this channel can be calculated as 

$$d_H(r) = \inf_{\mathcal{O}(\alpha, r)} E(\alpha)$$

where $\alpha_i \triangleq -\log \lambda_i / \log \mathrm{SNR}$ is the exponent of $\lambda_i$, $\mathcal{O}(\alpha, r)$ is the outage event set in terms of 
$\alpha$ and $r$ in the high SNR regime, and $E(\alpha)$ is the exponential order of the pdf $p(\alpha)$, i.e., 
$p(\alpha) \doteq \mathrm{SNR}^{-E(\alpha)}$. 

Proof: This lemma can be justified by using Laplace's method, as shown in [23]. □ 
Definition A1.1 (Wishart Matrix): The $m \times m$ random matrix $W = H H^{\dagger}$ is a (central) complex 
Wishart matrix with $n$ degrees of freedom and covariance matrix $R$ (denoted as $W \sim \mathcal{W}_m(n, R)$), 
if the columns of the $m \times n$ matrix $H$ are zero-mean independent complex Gaussian vectors 
with covariance matrix $R$. 

Lemma A1.2: The joint pdf of the eigenvalues of $W = H H^{\dagger} \sim \mathcal{W}_m(n, R_{m \times m})$ is identical to 
that of any $W' \sim \mathcal{W}_{m'}(n, \mathrm{diag}(\mu_1, \ldots, \mu_{m'}))$ if $\mu_1 \ge \cdots \ge \mu_{m'} > \mu_{m'+1} = \cdots = \mu_m = 0$ are 
the eigenvalues of $R_{m \times m}$. 

Proof: Apply the eigenvalue decomposition on $R$ and the result is immediate using the 
unitary invariance property [42] of Wishart matrices. □ 
Lemma A1.3 ([43]-[46]): Let $W$ be a central complex Wishart matrix $W \sim \mathcal{W}_m(n, R)$, where 
the eigenvalues of $R$ are distinct and their ordered values are $\mu_1 > \cdots > \mu_m > 0$. (In the 
particular case where some eigenvalues of $R$ are identical, we apply l'Hôpital's rule to the pdf 
obtained, as shown in [45].) Let $\lambda_1 > \cdots > \lambda_q > 0$ be the ordered positive eigenvalues of $W$ 
with $q \triangleq \min\{m, n\}$. The joint pdf of $\lambda$ conditioned on $\mu$ is 



m m , , 

X^,„Det(fii) n ""'Ar"^ n if n>m. 



p(A|/u) = < 



i<3 



(47a) 



m ^ n 



m, (47b) 

i<j ^' " ' ■" i<j 

where -f^rn.n and Gm,n are normalization factors; Det(-) denotes the absolute value of the 
determinant det(-); fli = [^~^^^^']^=i 



1 Hi 



,m— n— 1 ,,m— n— 1, 



,m— n— 1 



e Mm 



(48) 



1f> I I II'' 't- -L I I I 'I' 't- -L 11" 

■■■ ^ra ^ ' ' ' 

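In the non-correlated case with $R = I$, the joint pdf is 

$$p(\lambda) \propto \prod_{i=1}^{q} e^{-\lambda_i} \lambda_i^{|m-n|} \prod_{i<j} (\lambda_i - \lambda_j)^2. \qquad (49)$$

Now, let us define the eigen-exponents 

$$\alpha_i \triangleq -\log \lambda_i / \log \mathrm{SNR}, \ i = 1, \ldots, q, \quad \text{and} \quad \beta_i \triangleq -\log \mu_i / \log \mathrm{SNR}, \ i = 1, \ldots, m.$$

Lemma A1.4: 

$$\mathrm{Det}(\Omega_1) \doteq \begin{cases} \mathrm{SNR}^{-E_{\Omega_1}(\alpha, \beta)}, & \text{for } (\alpha, \beta) \in \mathcal{R}^{(1)}, \\ \mathrm{SNR}^{-\infty}, & \text{otherwise}, \end{cases} \qquad (50)$$

where 

$$E_{\Omega_1}(\alpha, \beta) \triangleq \sum_{j=1}^{m} \sum_{i<j} (\alpha_i - \beta_j)^+ \qquad (51)$$

and 

$$\mathcal{R}^{(1)} \triangleq \{\alpha_1 \le \cdots \le \alpha_m, \ \beta_1 \le \cdots \le \beta_m, \ \text{and } \beta_i \le \alpha_i, \text{ for } i = 1, \ldots, m\}. \qquad (52)$$

Proof: Please refer to [34] for details. □ 
Lemma A1.5: 

$$\mathrm{Det}(\Omega_2) \doteq \begin{cases} \mathrm{SNR}^{-E_{\Omega_2}(\alpha, \beta)}, & \text{for } (\alpha, \beta) \in \mathcal{R}^{(2)}, \\ \mathrm{SNR}^{-\infty}, & \text{otherwise}, \end{cases} \qquad (53)$$

where 

$$E_{\Omega_2}(\alpha, \beta) \triangleq \sum_{i=1}^{n} (m-n-1)\beta_i + \sum_{i=n+1}^{m} (m-i)\beta_i + \sum_{j=1}^{n} \sum_{i<j} (\alpha_i - \beta_j)^+ + \sum_{j=n+1}^{m} \sum_{i=1}^{n} (\alpha_i - \beta_j)^+ \qquad (54)$$

and 

$$\mathcal{R}^{(2)} \triangleq \{\alpha_1 \le \cdots \le \alpha_n, \ \beta_1 \le \cdots \le \beta_m, \ \text{and } \beta_i \le \alpha_i, \text{ for } i = 1, \ldots, n\}. \qquad (55)$$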
Proof: First, we have 



(55) 



Det(Q2) = n'"r""~'Det 



i=l 



-(m— n— 1) 



-(m— n— 1) 



-\n/^^l 



/im^ ' ■ ■ ■ 1 e^^i/'''" ■ ■ ■ 
Then, let us denote the determinant in the RHS of (l56l) as and we rewrite it as 

^(m-n-l) _ _ _ g g-Ai//ii _ g-Ai/^tm . . . g-A,i//ii _ g-A„/^t„ 



(56) 



D = Det 



,(m-n-l) 
m— l,m 

(m— n— 1) 



Q g — Al//im-l Ai//im . . . g — A„/^lm-l 



-An/M 



= Det 



(m— n— 1) 
m 



(1) 



(m— n— 1) 
m— l,m 



e 



-\l/^ir, 



'Ai/^ti 



— Ai//^,„_i 



-An//^ 



(57) 



-Xn/fll 



-Xn/ fJ-n 



-^A,;//t, 



') (58) 



i=l 



where rfj = ^.-'^ — ii- and the product term in (|58l) is obtained since 1 — e 
1 — e^'^'/^'" for all j < m. Let us denote the determinant in (l58l) as Then, by multiplying 



the first column in with /i™ " and noting that /i™ " d^- 



1 - if^m/t^i) 



m— n— 1 



1, 



the first column of D„i becomes all 1. Now, by eliminating the first m — 2 "l"s of the first 
column by subtracting all rows by the last row as in (l57l) and (|58l) . we have yU^~"~^Dm = 
YYi=i (l ~ e^'*''/^™) Drn-i- By continuing reducing the dimension, we get 



71+1 



m— I 



j=n+2 



n m 



n n (1 



i=\ j=n+l 

from which we prove the lemma, by applying (l50l) . □ 
With the preceding lemmas, we have the following lemma that provides the asymptotic pdf 
of $\alpha$ conditioned on $\beta$ in the high SNR regime. 

Lemma A1.6: 

'SNR-^("I^\ for {a, /3) e nc.lf3. 



p{a\(3) 



(59) 



.SNR" 



otherwise. 
where 

<? <? 1 rn q 

1=1 i=l j=l i<j i=9+l «=1 

and 

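$$\mathcal{R}_{\alpha|\beta} \triangleq \{\alpha_1 \le \cdots \le \alpha_q, \ \beta_1 \le \cdots \le \beta_m, \ \text{and } \beta_i \le \alpha_i, \text{ for } i = 1, \ldots, q\}. \qquad (61)$$

Proof: Let us replace $\mathrm{Det}(\Omega_1)$ and $\mathrm{Det}(\Omega_2)$ in (47a) and (47b) using the results of Lemma A1.4 
and Lemma A1.5. Then, by applying variable changes as done in [23], (60) can be obtained 
after some elementary manipulations. □ 
When $R = I$, i.e., $\mu_1 = \cdots = \mu_m = 1$, the joint pdf of $\alpha$ is found in [23] as shown in the 
following lemma. 
Lemma A1.7: 

$$p(\alpha) \doteq \begin{cases} \mathrm{SNR}^{-\sum_{i=1}^{q} (m+n-2i+1)\alpha_i}, & \text{for } \alpha \in \mathcal{R}_{\alpha}, \\ \mathrm{SNR}^{-\infty}, & \text{otherwise}, \end{cases} \qquad (62)$$

with $\mathcal{R}_{\alpha} \triangleq \{0 \le \alpha_1 \le \cdots \le \alpha_q\}$. 

This lemma can be justified either by using (49) or by setting $\beta_i = 0$, $\forall i$ in (60). 

Lemma A1.8 ([47]): Let $M$ be any $m \times n$ random matrix and $T$ be any $m \times m$ non-singular 
matrix whose singular values satisfy $\sigma_{\min}(T) \doteq \sigma_{\max}(T) \doteq \mathrm{SNR}^0$. Define $q \triangleq \min\{m, n\}$ and 
$\tilde{M} \triangleq TM$. Let $\sigma_1(M) \ge \cdots \ge \sigma_q(M)$ and $\sigma_1(\tilde{M}) \ge \cdots \ge \sigma_q(\tilde{M})$ be the ordered singular 
values of $M$ and $\tilde{M}$. Then, we have 

$$\sigma_i(\tilde{M}) \doteq \sigma_i(M), \quad \forall i.$$
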
Appendix II 
Proof of Theorem 3.1 

Proposition A2.1: Let us denote the non-zero ordered eigenvalues of $\Pi\Pi^{\dagger}$ by $\lambda_1 \ge \cdots \ge 
\lambda_{n_{\min}} > 0$ with $n_{\min} \triangleq \min_{i=0,\ldots,N} n_i$. Then, the joint pdf of the eigen-exponents $\alpha$ satisfies 

$$p(\alpha) \doteq \begin{cases} \mathrm{SNR}^{-E(\alpha)}, & \text{for } 0 \le \alpha_1 \le \cdots \le \alpha_{n_{\min}}, \\ \mathrm{SNR}^{-\infty}, & \text{otherwise}, \end{cases} \qquad (63)$$

where 

$$E(\alpha) \triangleq \sum_{i=1}^{n_{\min}} c_i \alpha_i \qquad (64)$$

with $c_i$'s defined by (14). 

From Lemma A1.1, we can derive the DMT with the following optimization problem 

$$d(r) = \min_{\mathcal{O}(r)} \sum_i c_i \alpha_i$$

with $\mathcal{O}(r) \triangleq \{\alpha : \sum_i (1 - \alpha_i)^+ < r\}$ being the outage region. Note that $c_i$ is decreasing and $\alpha_i$ is 
increasing with respect to $i$. Then, the proof of Theorem 3.1 is immediate. 
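
The optimization above is simple enough to evaluate directly. The sketch below is an illustration under two assumptions: the coefficients $c_i$ of (14) take the form $c_i = 1 - i + \min_{k=1,\ldots,N} \lfloor (\sum_{l=0}^{k} \tilde{n}_l - i)/k \rfloor$ (the form that appears in Lemma A2.1 and in (72)), with $\tilde{n}_0 \le \cdots \le \tilde{n}_N$ the ordered antenna numbers, and the minimization is solved greedily by spending the allowed outage "budget" $r$ on the largest coefficients first, which is valid because $c_i$ is non-increasing. The function names are hypothetical.

```python
def dmt_coefficients(antennas):
    """Coefficients c_1, ..., c_{n_min} for a multihop channel (n_0, ..., n_N).

    Assumes c_i = 1 - i + min_k floor((sum_{l<=k} n~_l - i) / k), where
    n~_0 <= ... <= n~_N are the antenna numbers in increasing order.
    """
    n = sorted(antennas)
    hops = len(n) - 1
    prefix, total = [], 0
    for value in n:
        total += value
        prefix.append(total)  # prefix[k] = n~_0 + ... + n~_k
    return [
        1 - i + min((prefix[k] - i) // k for k in range(1, hops + 1))
        for i in range(1, n[0] + 1)
    ]


def product_channel_dmt(antennas, r):
    """d(r) = min sum_i c_i alpha_i subject to sum_i (1 - alpha_i)^+ <= r."""
    coeffs = dmt_coefficients(antennas)
    if not 0 <= r <= len(coeffs):
        raise ValueError("multiplexing gain must lie in [0, n_min]")
    d, budget = 0.0, float(r)
    for c in coeffs:
        spent = min(1.0, budget)   # push alpha_i from 1 down to 1 - spent
        d += c * (1.0 - spent)
        budget -= spent
    return d


if __name__ == "__main__":
    for chain in [(3, 1, 4, 2), (2, 2, 2, 2), (2, 4, 3)]:
        print(chain, dmt_coefficients(chain), product_channel_dmt(chain, 0))
```

Under these assumptions the maximum diversity $d(0) = \sum_i c_i$ evaluates to 2, 3 and 6 for the (3, 1, 4, 2), (2, 2, 2, 2) and (2, 4, 3) channels, in agreement with the AF diversity orders quoted in Section VII, and a single-hop (2, 2) channel recovers the familiar value $d(0.5) = 2.5$.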

Now, what remains is the proof of Proposition A2.1. The following lemma will be needed in 
the proof. 

Lemma A2.1: Let $\mathcal{I}_k \triangleq [p_k, p_{k-1}]$, $k = 1, \ldots, N$, be $N$ consecutively joint intervals with 
$p_N \triangleq -\infty$, $p_0 \triangleq \tilde{n}_0$, and 

$$p_k \triangleq \sum_{l=0}^{k} \tilde{n}_l - k\tilde{n}_{k+1}, \quad k = 1, \ldots, N-1. \qquad (65)$$

Then, we have 

$$c_i = 1 - i + \left\lfloor \frac{\sum_{l=0}^{k} \tilde{n}_l - i}{k} \right\rfloor, \quad \text{for } i \in \mathcal{I}_k. \qquad (66)$$

Proof: $c_i$ defined by (14) is the minimum of $N$ sequences corresponding to the values 
of $k$. It is enough to show that each of the sequences dominates in a consecutive manner. We 
omit the details here. □ 
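
Since the details of the proof are omitted, a quick numerical check of the lemma may be useful. The sketch below computes the breakpoints $p_k$ of (65) and verifies, for each $i = 1, \ldots, \tilde{n}_0$, that the single term selected by the interval $\mathcal{I}_k$ containing $i$ coincides with the minimum over all $k$. It assumes that (14) is the minimum over $k = 1, \ldots, N$ of the same floor expression; the function names are hypothetical.

```python
def breakpoints(n_sorted):
    """Return [p_0, p_1, ..., p_N] of (65) for ordered antenna numbers."""
    hops = len(n_sorted) - 1
    p = [n_sorted[0]]                       # p_0 = n~_0
    for k in range(1, hops):
        p.append(sum(n_sorted[:k + 1]) - k * n_sorted[k + 1])
    p.append(float("-inf"))                 # p_N = -infinity
    return p


def lemma_holds(antennas):
    """Check (66) against the full minimization for every i = 1, ..., n~_0."""
    n = sorted(antennas)
    hops = len(n) - 1
    p = breakpoints(n)
    for i in range(1, n[0] + 1):
        # smallest k with p_k <= i, so that i lies in I_k = [p_k, p_{k-1}]
        k = next(k for k in range(1, hops + 1) if p[k] <= i)
        single = (sum(n[:k + 1]) - i) // k
        overall = min((sum(n[:j + 1]) - i) // j for j in range(1, hops + 1))
        if single != overall:
            return False
    return True


if __name__ == "__main__":
    print(breakpoints([1, 2, 3, 4]))        # [1, 0, -2, -inf]
    print(lemma_holds((5, 3, 4, 4, 6)))     # True
```

The check reflects the argument sketched in the proof: the $k$-th sequence is no larger than the $(k-1)$-th one exactly when $i \le p_{k-1}$, and the breakpoints $p_k$ are non-increasing in $k$, so each sequence attains the minimum on its own interval.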



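A. Sketch of the Proof of Proposition A2.1 

The proof will be by induction on $N$. From Lemma A1.7, the proposition is trivial for $N = 1$. 
Suppose the proposition holds for some $N$ and $\Pi = H_1 \cdots H_N$, we would like to show that 
it is also true for $N + 1$ and $\Pi' = H_1 \cdots H_{N+1}$. For simplicity, the "primed" notations (e.g., 
$\alpha'$, $n'$, $\tilde{n}'$, $c'$, $n'_{\min}$, etc.) will be used for the respective parameters of $\Pi'$. Note that $\Pi'(\Pi')^{\dagger} \sim 
\mathcal{W}_{n_0}(n_{N+1}, \Pi\Pi^{\dagger})$ for a given $\Pi$, since $\Pi' = \Pi H_{N+1}$. According to Lemma A1.2, the pdf of the 
eigenvalues $\lambda'$ of $\Pi'(\Pi')^{\dagger}$ is identical to that of $\mathcal{W}_{n_{\min}}(n_{N+1}, \mathrm{diag}(\lambda))$. Hence, the pdf of $\alpha'$ can 
be obtained as the marginal pdf of $(\alpha', \alpha)$ 

$$p(\alpha') = \int p(\alpha', \alpha)\, d\alpha = \int p(\alpha' | \alpha)\, p(\alpha)\, d\alpha$$
$$\doteq \int_{\mathcal{R}} \mathrm{SNR}^{-E(\alpha'|\alpha)}\, \mathrm{SNR}^{-E(\alpha)}\, d\alpha \qquad (67)$$
$$\doteq \mathrm{SNR}^{-E(\alpha')}, \qquad (68)$$

where (67) comes from Lemma A1.6 and our assumption that (63) holds for $N$, with 

$$\mathcal{R} \triangleq \{0 \le \alpha'_1 \le \cdots \le \alpha'_{n'_{\min}}, \ 0 \le \alpha_1 \le \cdots \le \alpha_{n_{\min}}, \ \text{and } \alpha_i \le \alpha'_i, \text{ for } i = 1, \ldots, n'_{\min}\} \qquad (69)$$

being the feasible region; the exponent $E(\alpha')$ in (68) is 

$$E(\alpha') = \min_{\mathcal{R}} E(\alpha', \alpha) \qquad (70)$$

with $E(\alpha', \alpha) \triangleq E(\alpha'|\alpha) + E(\alpha)$. From (60) and (64), 

$$E(\alpha', \alpha) = \sum_{i=1}^{n'_{\min}} (n_{N+1} - i + 1)\alpha'_i + \sum_{j=1}^{n_{\min}} \Big( (j - 1 - n_{N+1} + c_j)\alpha_j + \sum_{i<j} (\alpha'_i - \alpha_j)^+ \Big). \qquad (71)$$

It remains to show that $E(\alpha') = E'(\alpha') \triangleq \sum_i c'_i \alpha'_i$ with 

$$c'_i \triangleq 1 - i + \min_{k=1,\ldots,N+1} \left\lfloor \frac{\sum_{l=0}^{k} \tilde{n}'_l - i}{k} \right\rfloor, \quad i = 1, \ldots, n'_{\min}, \qquad (72)$$

by solving the optimization problem (70), which is accomplished in the rest of the section. 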

B. Solving the Optimization Problem 

We need to distinguish three cases, according to how the value of $n_{N+1}$ affects the ordered 
dimension $\tilde{n}'$. 

Fig. 7. (a) Case 1, (b) Case 2, (c) Case 3. For each $j$, the black dots represent the $\alpha'_i$'s that are freed by $\alpha_j$. Therefore, we can get the total number of freed 
$\alpha'_i$ by counting the black dots in row $i$. More precisely, there are $\lfloor g^{-1}(i) \rfloor - \lceil f^{-1}(i) \rceil + 1 = \lfloor g^{-1}(i) \rfloor - i$ black dots for 
$i \le g(n_{\min})$, and $n_{\min} - \lceil f^{-1}(i) \rceil + 1 = n_{\min} - i$ black dots for $i > g(n_{\min})$. 


1) Case 1 [$n_{N+1} < \tilde{n}_0$]: In this case, we have $n'_{\min} = \tilde{n}'_0 = n_{N+1}$. Minimization of 
$E(\alpha, \alpha')$ of (71) with respect to $\alpha$ can be decomposed into $n_{\min}$ minimizations with respect 
to $\alpha_1, \ldots, \alpha_{n_{\min}}$ successively, i.e., $\min_{\alpha} = \min_{\alpha_{n_{\min}}} \cdots \min_{\alpha_1}$. We start with $\alpha_1$. From (69), the 
feasible region of $\alpha_1$ is $0 \le \alpha_1 \le \alpha'_1$. Since the only $\alpha_1$-related term in (71) is $(c_1 - n_{N+1})\alpha_1$ 
and $c_1 - n_{N+1} > 0$ for $n_{N+1} < \tilde{n}_0$, we have $\alpha_1^* = 0$. Now, suppose that the minimization with 
respect to $\alpha_1, \ldots, \alpha_{j-1}$ is done and that we would like to minimize with respect to $\alpha_j$. For $\alpha_j$, 
$j \le n'_{\min}$, we set the initial region as 

$$0 \le \alpha'_1 \le \cdots \le \alpha'_{j-1} \le \alpha_j \le \alpha'_j$$

in which we have $\sum_{i<j} (\alpha'_i - \alpha_j)^+ = 0$. The feasibility conditions in (69) require that $\alpha_j$ must 
not go right across $\alpha'_j$. The only choice is therefore to go to the left. Each time $\alpha_j$ goes across 
an $\alpha'_i$ from the right to the left, $(\alpha'_i - \alpha_j)^+$ increases by $\alpha'_i - \alpha_j$, which increases the coefficient 
of $\alpha'_i$ by 1 and decreases the coefficient of $\alpha_j$ by 1. It can be shown that, to minimize the value 
of $E(\alpha, \alpha')$ with respect to $\alpha_j$, $\alpha_j$ is allowed to cross $\alpha'_i$ only when the current coefficient of 
$\alpha_j$ in (71) is positive. So, $\alpha_j$ stops moving only in the following two cases: 1) it hits the 
left extreme, 0; and 2) its coefficient achieves 0 when it is in the interval $[\alpha'_k, \alpha'_{k+1}]$ for some 
$k < j$. In either case, the $\alpha_j$-related terms are gone and what remain are the $\alpha'_i$'s "freed" by $\alpha_j$ from 
$\sum_{i<j} (\alpha'_i - \alpha_j)^+$. The same reasoning applies to $\alpha_j$ for $j > n'_{\min}$, except that the initial region is set 
to $0 \le \alpha'_1 \le \cdots \le \alpha'_{n'_{\min}} \le \alpha_j$. 

When the coefficient of $\alpha_j$ in (71) is positive, decreasing $\alpha_j$ decreases $E(\alpha, \alpha')$. 


Therefore, the optimization problem can be solved by counting the total number of freed a'/s. 
As shown in Fig. |7(a)[ when j is small, the initial coefficient of aj is large and thus aj can 
free out a^.i, ■ ■ ■ ,a[. We have a* = 0, which corresponds to the first stopping condition. For 
large j, the initial coefficient of aj is not large enough and only 



a 



is freed, which 



corresponds to the second stopping condition. With the above reasoning, we can get g(j) 

'j - 1 - (j - 1 - nN+i + Cj) + 1, for J < n'j^in, 

for J > 



(73) 



nN+i - Cj + 1, 



From dVB]) and ([HI), we get 

9{j) = nN+i 

and 



mm 

k=l,...,N 



E!=o^' - {k + l)j 



n 



mm 

k=l,...,N 



k+ 1 



(74) 



(75) 



Now, E{a') can be obtaineco from Fig. |7(a) 



E(a') = £, 



i=l 

3("miii) 



9(nmin) 

i=l 



i)a' + 



% I a, 



«=3('^min) + l 



E (1 -2i + n^+i+ [^-i(z)J)a^+ J2 (l-2i + njv+i + 

j=l «=5(nmin) + l 

9("min) 



^min ) 



2=1 



1 — i + min 

fc=2,...,Ar+l 



1=0 ^1 - 

A; 



a- + E (1 - 2i + riAT+i + n 

«=9("min) + l 



i=l \ 



I + mm 

k=l,...,N+l 



a. 



where (l76l) is from (l75l) and the fact that fig = tin^i, n\ = fii^i, I = 1 
derived from lemma |A2?T1 since p[ = tin+i + no — ni = g{na 



(76) 
(77) 
(78) 

, + 1; dTT]) can be 
and therefore the term min^ 



''^In the above minimization procedure, we ignored the feasibility condition aj > at, Vj > k. A more careful analysis 
reveals that it is always satisfied with the described procedure. 
in (77) is dominated by $k \ge 2$ for $i \le g(n_{\min})$ and by $k = 1$ for $i > g(n_{\min})$, corresponding to 
the two terms in (76), respectively. 

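2) Case 2 [$n_{N+1} \in [\tilde{n}_0, \tilde{n}_1)$]: In this case, we have $n'_{\min} = n_{\min}$ and $\tilde{n}'_1 = n_{N+1}$. From (71), 

$$E(\alpha', \alpha) = \sum_{i=1}^{n_{\min}} (n_{N+1} - i + 1)\alpha'_i + \sum_{j=1}^{n_{\min}} \Big( (j - 1 - n_{N+1} + c_j)\alpha_j + \sum_{i<j} (\alpha'_i - \alpha_j)^+ \Big). \qquad (79)$$

Since $j - 1 - n_{N+1} + c_j > 0$, $\forall j \le n_{\min}$, the minimization of $E(\alpha', \alpha)$ with respect to $\alpha$ is 
in exactly the same manner as in the previous case. Therefore, $E(\alpha')$ can be obtained from 
Fig. 7(b) with $g(j)$ in the same form as (74) 

$$E(\alpha') = \sum_{i=1}^{n_{\min}} (n_{N+1} - i + 1)\alpha'_i + \sum_{i=1}^{g(n_{\min})} \big( \lfloor g^{-1}(i) \rfloor - i \big)\alpha'_i + \sum_{i=g(n_{\min})+1}^{n_{\min}} (n_{\min} - i)\alpha'_i$$
$$= E'(\alpha'). \qquad (80)$$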

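3) Case 3 [$n_{N+1} \in [\tilde{n}_1, \infty)$]: As in the previous case, we have $n'_{\min} = n_{\min}$ and the same 
$E(\alpha', \alpha)$ as defined in (79). Without loss of generality, we assume that $n_{N+1} \in [\tilde{n}_{k^*}, \tilde{n}_{k^*+1})$ for 
some $k^* \in [1, N]$ (we set $\tilde{n}_{N+1} = \infty$). Then, we have 

$$\tilde{n}'_l = \tilde{n}_l, \quad \text{for } l = 0, \ldots, k^* \qquad (81)$$

and 

$$p_{k^*} < p'_{k^*} \le p_{k^*-1} = p'_{k^*-1} \le \cdots \le p_1 = p'_1. \qquad (82)$$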

Unlike the previous case, j — 1 — n^+i + Cj is not always positive. Let j be the smallest integer 
such that the coefficient j — I — tin+i + Cj of aj in (1791 ) is zero. It is obvious that for j > j, 
a* = a'j. Hence, we have 

n' ■ j — 1 n' . 

mm _ min 

i=l 1=1 j=j 



where the second term is from Fig. |7(c)[ Furthermore, we can show that j < p'^,, since p';^. — 



1 — un+i + Cp'^, = 0. Therefore, we get 

Pfe.-l <in 

^(a') = ^ (1 - 2^ + n;v+i + [9~\^)\) < + (^tv+i - ^ + + ^^^^ 
Now, we would like to show that the coefficient of a[ in (|83] ) coincides with c[. First, for i < j — 1, 
i E J^,^^ U • ■ ■ U JJy and lemma IA2.1I implies that 

1=0^1-^ 



1 — 2i + riN+i + \ q Vi) I = 1 — i + min 

k=2,...,N+l 



1 — i + min 

k=l,...,N+l 



Ek ~ I 
1=0^1-^ 



Then, for i > p'^,, we have 



Hence, 



i e (Xfc. u ■ ■ ■ u j;) n (Xfc* u ■ ■ ■ u Xi) . 



c' = 1 — z + min 

k=l,...,k* 



1 — i + min 




(84) 



where (f84l) is from (ISTI) and (|82l) . Finally, for i G [j,p'k*)^ let us rewrite i = p'^, 
i-l - un+i + q = 0, Vi G we have 



A,. Since 



J2i=o ni-i- k*nN+i 
k* 



Zlz=o ~ P'k* + ^i- k*nN 



+1 



k* 



A,- 



from which we have Aj G [0, k* — 1] and 



Ya=q ni + riN+i ~ I 



k* + 1 



1 -i 



fc* + 1 
1 + un+i - i. 



1-i 



The proof is complete. 
C. Proof of Proposition 3.1 

Let $\alpha(M)$ denote the vector of the eigen-exponents of a matrix $M$ as previously defined. To 
prove the first statement, we use induction on $N$. Suppose that it is true for $N$, which means that the 
joint pdf of $\alpha(\Pi_g \Pi_g^{\dagger})$ is the same as that of $\alpha(\Pi\Pi^{\dagger})$. Furthermore, we know by Lemma A1.8 
that $\alpha(\Pi_g T_{N,N+1} T_{N,N+1}^{\dagger} \Pi_g^{\dagger}) = \alpha(\Pi_g \Pi_g^{\dagger})$. The same steps as in (67) and (68) complete the proof. To prove 
the second statement, we perform a singular value decomposition on the matrices $T_{i,i+1}$'s and 
then apply the first statement. 



Appendix III 
Proof of Theorem 3.2 and Theorem 3.3 

A. Proof of Theorem 3.2 



Let 



(m) A -I • I 

c\ — \ — I + mm 

A:=l,...,m 



1=0 - ^ 



^ = 1, 



(85) 



What we should prove is that cf^^ = cf'\ for i = 1, . . . ,nmin if and only if (fT6l) is true. To 



this end, it is enough to show that 

JN) _ (N-1) 



C 



for i = 1 



, . . . , JT-min 



(86) 



if and only if pn-i ^ N — 1, that is, (A^ — 1) {un + 1) > ^^'^ ^^^^ ^PPly the result 

successively to show the theorem. Note that we need Lemma |A2. II to eliminate the minimization 
in (l85l) . The detailed proof is omitted here. 



B. Proof of Theorem 3.3 

The direct part of the theorem is trivial. To show the converse, let h = (no, ni, . . . , un) and 



n 



Q, /t]^, . . . , Uj^ 



be the two concerned minimal forms. In addition, we assume, without 



loss of generality, that 



~ I 



5 ^IM-I- 



n. 
with iM N and < N' . Now, let us define Cqj = Q — (1 — i) with q defined in (|66l ). It can 
be shown that M intervals are non-trivial with |Xj^| 7^ 0, /c = 1, . . . , M. The values of Cgi s are 
in the following form 



■ ■ ■ 1 '^IM 1 ■ ■ ■ 1 '^IM^ ^ihl ^1 ■ ■ ■ 1 '^IM ^1 ■ ■ ■ 1 f^lM-il ■ ■ ■ 1 f^lM-ll ■ ■ ■ 1 ^2 1) • • • 5 '^l ~l~ I5 ^1 • 
V ' ^ ~v ' ^ . ' 

Same arguments also apply to h with M' and i', etc. It is then not difficult to see that to have 
exactly the same coi's (thus, same q's), we must have N = N' and 



rii = n'i, \fi = 0, . . . ,N, 



that is, the same minimal form. 



Appendix IV 
Proof of Theorem 3.4 

A. Sketch of the Proof 

To prove the theorem, we will first show the following equivalence relations 



{Rr{k),Rf\^,k))^ 




{Rf^ 


Rf\z,k) 


(b) 


Rf\. 


(i?r)(A:),i?f)(iV-l)). 


(± 


{Rf^ 


{R[^\k),Ri'^\'i?> with ordered n) 


id) 


{Rf'> 



{R'^">{k),Ri^^{i) with ordered n); 
{Ri'^'{k), R2^\n — 1) with ordered and minimal n). 

1) Equivalences (a) and (b): The direct parts of (a), (b), and (d) are immediate since the 
RHS are particular cases of the left hand side (LHS). To show the reverse part of (a), we rewrite 

<0,...,n.)(^) = <o-fc,...,n.-*t)(0) (87) 
= {dTno-K..,n.-k)U) + rf(Jn.,,-.,...,n.-.)(0)} (88) 

= f^, {<0,...,n,)(j') + 4'..,„...,n.)(A:)} , (89) 
where Ri is used twice in (HT]) and ([891); R2 is used in ([88]). As for (b), if Rf\N-l, k) holds, 
then 



dtL..,n.){k) = min + C„^)(A;)| (90) 



= |<0,...,n._.)(j') + + dfln.)ik)\ (91) 

= {<o,...,n.-.)(/) + ^S^,n^-.n.)(^)} (92) 

which proves rI^\n — 2,k). By continuing the process, we can show that Rl^\i, k) is true 
for all i, provided rI^\n — 1, k) holds. 

2) Equivalences (c) and (d): Through (a) and (6), one can verify that the LHS of (c) is 
equivalent to the RHS of (a) of which the RHS of (c) is a particular case. Hence, the direct 
part of (c) is shown. The reverse part of (c) can be proved by induction on A^. For N = 2, 
can be shown explicitly using the direct characterization (fT3l) . Now, assuming that 
Ri^\N-l) for non-ordered n, we would like to show that i?^+^(A^) holds. Let us write 

gnj4|^^...,,^_^,,^^^^...,,^)(fc) + 4^^^^)(jO + rfj?,0(0)} 

(94) 

(95) 



= mi 

k>j 



mm 

k>j'>0 



= ^S'o,...,niv+i)(0)' 

where the permutation invariance property is used in (93); $R_2^{(N)}(N-1,k)$ is used in (94), since
we assume that $R_2^{(N)}(N-1)$ is true; $n_i$ and $n_{N+1}$ can be permuted according to $R_2^{(N)}(i)$. Finally,
we should prove the reverse part of (d), i.e.,

$$d_{(n_0,\ldots,n_N)}(0) = \min_{j \ge 0} \bigl\{ d_{(n_0,\ldots,n_{N-1})}(j) + j\,n_N \bigr\} \tag{96}$$

provided that $R_2^{(N)}(N-1)$ holds for minimal $\boldsymbol{n}$.

If $\boldsymbol{n}$ is not minimal, then showing (c) is equivalent to showing

$$d_{(n_0,\ldots,n_{N^*})}(0) = \min_{j \ge 0} \bigl\{ d_{(n_0,\ldots,n_{N^*})}(j) + j\,n_N \bigr\}, \tag{97}$$

where $N^*$ is the order of $\boldsymbol{n}$ with $n_{N^*+1} \le n_N$. Therefore, we should show that the minimum is
achieved with $j = 0$. According to the direct characterization (13), this is true only when $n_N \ge c_1$.
Let us rewrite $c_1$ as

' N* 

N*nN*+i +Pn* 
N* 

Since $p_{N^*} \ge N^*$ is always true according to the reduction theorem, we have $c_1 \le n_{N^*+1} \le n_N$.
The rest of this section is devoted to proving that (96) holds for minimal $\boldsymbol{n}$.



B. Minimal n 

Now, we restrict ourselves to the case of minimal and ordered $\boldsymbol{n}$, i.e., we would like to prove

$$d_{(n_0,\ldots,n_{N^*})}(0) = \min_{j \ge 0} \bigl\{ d_{(n_0,\ldots,n_{N^*-1})}(j) + j\,n_{N^*} \bigr\}. \tag{98}$$

Since $c_{p_{N^*-1}} \le n_{N^*}$, the optimal $j$ is in the interval $\mathcal{I}^* = [1, p_{N^*-1}]$. Now, showing (98) is
equivalent to showing 



Pn*-i 



El 



i=l 



N* 



min > 1 

Piv._i>i>o 



22l=0 ^1 

N* - : 



which, after some simple manipulations, is reduced to 



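$$\sum_{i=1}^{p_M}\left(i - p_M + \left\lfloor \frac{i-1}{M+1} \right\rfloor\right) = \min_{k} \sum_{i=1}^{k}\left(i - p_M + \left\lfloor \frac{i-1}{M} \right\rfloor\right) \tag{99}$$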



where we set $M = N^* - 1$ for simplicity of notation. Obviously, the minimum of the RHS of
(99) is achieved by the $k^*$ such that

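$$k^* - p_M + \left\lfloor \frac{k^*-1}{M} \right\rfloor \le 0, \tag{100}$$

$$\text{and} \quad (k^*+1) - p_M + \left\lfloor \frac{k^*}{M} \right\rfloor > 0. \tag{101}$$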

Let us decompose $k^*$ as $k^* = aM + b$ with $b \in [1, M]$. Then, (100) becomes

$$aM + b - p_M + a \le 0 \tag{102}$$

which also implies that $aM + 1 - p_M + a \le 0$, from which $a = \left\lfloor \frac{p_M - 1}{M+1} \right\rfloor$. The form of $a$ suggests
that $p_M$ can be decomposed as

$$p_M = a(M+1) + \bar{b}. \tag{103}$$
From (102) and (103), we have $b \le \bar{b}$ and thus $b = \min\{M, \bar{b}\}$. With the form of the optimal $k^*$
and some basic manipulations, we finally have

$$\sum_{i=1}^{p_M}\left(i - p_M + \left\lfloor \frac{i-1}{M+1} \right\rfloor\right) = \sum_{i=1}^{k^*}\left(i - p_M + \left\lfloor \frac{i-1}{M} \right\rfloor\right),$$

which ends the proof.
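As a numerical sanity check of the last steps (not part of the original argument), the following Python sketch compares the closed form $k^* = aM + b$, with $a = \lfloor (p_M-1)/(M+1) \rfloor$ and $b = \min\{M,\, p_M - a(M+1)\}$, against a brute-force minimization over $k$ of the partial sums $\sum_{i \le k} (i - p_M + \lfloor (i-1)/M \rfloor)$. It also checks that this minimum equals $\sum_{i=1}^{p_M} (i - p_M + \lfloor (i-1)/(M+1) \rfloor)$. The function names and the tested ranges of $M$ and $p_M$ are assumptions made for illustration only.

```python
def term(i, p, m):
    """Summand i - p + floor((i - 1) / m) appearing on both sides of the identity."""
    return i - p + (i - 1) // m


def k_star_closed_form(p, M):
    """Closed form k* = a*M + b with a = floor((p-1)/(M+1)), b = min(M, b_bar)."""
    a = (p - 1) // (M + 1)
    b_bar = p - a * (M + 1)          # remainder of p, lies in [1, M + 1]
    return a * M + min(M, b_bar)


def check(p, M):
    """Return True if the closed-form k* is optimal and both sums agree."""
    lhs = sum(term(i, p, M + 1) for i in range(1, p + 1))
    # partial[k] is the sum of the first k terms; k = 0 gives the empty sum
    partial = [sum(term(i, p, M) for i in range(1, k + 1)) for k in range(p + 1)]
    ks = k_star_closed_form(p, M)
    # last non-positive term at k*, first positive term at k* + 1
    sign_change = term(ks, p, M) <= 0 < term(ks + 1, p, M)
    return sign_change and partial[ks] == min(partial) == lhs


# illustrative (hypothetical) ranges for M and p_M
all_ok = all(check(p, M) for M in range(1, 8) for p in range(1, 40))
print(all_ok)
```

The check is exhaustive only over the small ranges above; it illustrates the identity and does not replace the algebraic argument.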



Appendix V 
Proof of Lemmas 4.1 and 4.2



A. Proof of Lemma 4.1
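First, we have

$$\mathrm{SNR}\,\lambda_{\max}(H^\dagger H) \le \mathrm{SNR}\,\|H\|_F^2 \le \det(I + \mathrm{SNR}\,H^\dagger H),$$

from which

$$P\{\mathrm{SNR}\,\lambda_{\max}(H^\dagger H) < 1 + \epsilon\} \ge P\{\det(I + \mathrm{SNR}\,H^\dagger H) < 1 + \epsilon\} \tag{104}$$

with $\epsilon$ being some strictly positive constant. Then, we also have

$$P\{\mathrm{SNR}\,\lambda_{\max}(H^\dagger H) < 1 + \epsilon\} \le P\{\det(I + \mathrm{SNR}\,H^\dagger H) < (2 + \epsilon)^{\operatorname{rank}(H)}\}, \tag{105}$$

since $\det(I + \mathrm{SNR}\,H^\dagger H) = \prod_i (1 + \mathrm{SNR}\,\lambda_i(H^\dagger H))$. From (104) and (105), we have

$$P\{\mathrm{SNR}\,\lambda_{\max}(H^\dagger H) < 1 + \epsilon\} \doteq P\{\det(I + \mathrm{SNR}\,H^\dagger H) < 1 + \epsilon'\}$$

where $\epsilon'$ is another strictly positive constant. Hence, $P\{\mathrm{SNR}\,\|H\|_F^2 < 1 + \epsilon\} \doteq \mathrm{SNR}^{-d(0)}$. The
lemma is proved since $P\{\mathrm{SNR}\,\|H\|_F^2 < 1 + \epsilon\} \doteq P\{\mathrm{SNR}\,\|H\|_F^2 < 1\}$.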
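The chain of inequalities behind this proof can be checked numerically. The sketch below works directly with a list of non-negative numbers standing in for the eigenvalues of $H^\dagger H$ (so that $\|H\|_F^2$ is their sum and the determinant is $\prod_i (1 + \mathrm{SNR}\,\lambda_i)$). The random eigenvalue model, the SNR range and the function name are assumptions made for illustration.

```python
import random


def sandwich_holds(eigs, snr):
    """Check SNR*l_max <= SNR*sum(l) <= prod(1 + SNR*l) <= (1 + SNR*l_max)^n."""
    lam_max = max(eigs)
    frob_sq = sum(eigs)              # ||H||_F^2 = trace(H^H H) = sum of eigenvalues
    det = 1.0
    for lam in eigs:
        det *= 1.0 + snr * lam       # det(I + SNR H^H H)
    tol = 1e-9 * det                 # small slack for floating-point rounding
    return (snr * lam_max <= snr * frob_sq + tol
            and snr * frob_sq <= det + tol
            and det <= (1.0 + snr * lam_max) ** len(eigs) + tol)


random.seed(0)
trials = []
for _ in range(1000):
    n = random.randint(1, 4)                              # hypothetical number of eigenvalues
    eigs = [random.expovariate(1.0) for _ in range(n)]    # hypothetical eigenvalue draw
    snr = 10 ** random.uniform(-1, 3)
    trials.append(sandwich_holds(eigs, snr))
print(all(trials))
```

The last inequality in the check is the one that gives the upper bound on the determinant when $\mathrm{SNR}\,\lambda_{\max} < 1 + \epsilon$, since each factor is then below $2 + \epsilon$.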



B. Proof of Lemma 4.2

Let us consider a parallel channel $\{H_k\}_{k=1}^{K}$, each sub-channel of rank $M_k$ and with eigen-exponents $\{\alpha_{1,k}, \alpha_{2,k}, \ldots, \alpha_{M_k,k}\}$. Since each sub-channel is an AF path, the joint pdf of the
eigen-exponents in the high SNR regime is $p_k(\boldsymbol{\alpha}_k) \doteq \mathrm{SNR}^{-\sum_i c_{i,k}\alpha_{i,k}}$. From Lemma A1.1, the
DMT is

$$d_P(r) = \min_{\boldsymbol{\alpha} \in \mathcal{O}(r)} \sum_k \sum_i c_{i,k}\,\alpha_{i,k}$$

with $\mathcal{O}(r) = \bigl\{\sum_k \sum_i (1 - \alpha_{i,k})^+ < Kr\bigr\}$ being the outage region. First, we can deduce that

$$d_P(0) = \sum_k \sum_i c_{i,k} = \sum_k d_k(0).$$

Then, if all AF paths have the same DMT, they have the same set $\{c_i\}_i$, i.e., $c_{i,k} = c_i$, $\forall k$.
We can verify that setting $\alpha_{i,k} = \alpha_i$, $\forall k$, is without loss of optimality, since 1) the objective
function is linear and symmetric on different $k$, and 2) the constraints are convex and symmetric
on different $k$. Finally, the optimization problem becomes

$$\min_{\boldsymbol{\alpha} \in \mathcal{O}_0(r)} K \sum_i c_i\,\alpha_i$$

where $\mathcal{O}_0(r) = \bigl\{\sum_i (1 - \alpha_i)^+ < r\bigr\}$ is the outage region of each single AF path. The lemma can
be proved immediately from here.
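The claim that the symmetric choice of eigen-exponents is without loss of optimality can be spelled out in one step. The following is a sketch, where the averaged exponent $\bar{\alpha}_i$ is notation introduced here and the constraint is written as $\sum_k \sum_i (1 - \alpha_{i,k})^+ < Kr$.

```latex
% Averaging over the K sub-channels (\bar{\alpha}_i is introduced for this sketch):
\bar{\alpha}_i := \frac{1}{K} \sum_{k=1}^{K} \alpha_{i,k} .

% Feasibility: x \mapsto (1 - x)^+ is convex, so by Jensen's inequality
\sum_i (1 - \bar{\alpha}_i)^+
  \;\le\; \frac{1}{K} \sum_{k=1}^{K} \sum_i (1 - \alpha_{i,k})^+
  \;<\; r .

% Objective: with c_{i,k} = c_i the cost is unchanged by averaging,
K \sum_i c_i \, \bar{\alpha}_i \;=\; \sum_{k=1}^{K} \sum_i c_i \, \alpha_{i,k} .
```

Averaging also preserves the ordering of the exponents within each sub-channel, so any feasible point can be replaced by a symmetric one with the same cost.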

Appendix VI 
Other Proofs 

A. Proof of Proposition 4.2

Without loss of generality, we assume that $n_0 \ge n_2$. Then, the bottleneck of the channel is the
$n_1 \times n_2$ channel. Since the partition achieves the maximum diversity, by Theorem 4.1 the partition
size is $K = K_1K_2$ with the $n_1$ (respectively, $n_2$) antennas being partitioned into $K_1$ (respectively,
$K_2$) supernodes. Moreover, for any AF path $k$ in the partition, we have $n_{k,0} + 1 \ge n_{k,1} + n_{k,2}$.
Adding all the $K$ inequalities up gives

$$\sum_{k=1}^{K} n_{k,0} + K_1K_2 \ge K_2 n_1 + K_1 n_2. \tag{106}$$

The sum in the LHS of (106) can be upper-bounded by $K_1 n_0$, since each supernode in the
transmitter cannot be connected to more than $K_1$ nodes. Hence, we have the following inequality
after some simple manipulations

$$K_1 \ge \left\lceil \frac{K_2 n_1}{K_2 + n_0 - n_2} \right\rceil$$

from which we have the lower bound on the partition size

$$K_1K_2 \ge K_2 \left\lceil \frac{K_2 n_1}{K_2 + n_0 - n_2} \right\rceil$$

which is obviously increasing with $K_2$. Therefore, the minimum lower bound is obtained by
setting $K_2 = 1$ and it coincides with (29). It can be shown that this lower bound is achieved by
partitioning the intermediate layer into $K$ supernodes with $K$ defined by (29) without partitioning
either of the source and the destination antennas.
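The dependence of the lower bound on $K_2$ can be illustrated with a few lines of Python. This is a minimal sketch under assumptions: the bound is taken as $K_2 \lceil K_2 n_1 / (K_2 + n_0 - n_2) \rceil$ (the ceiling reflecting that $K_1$ is an integer), and the antenna configuration $(n_0, n_1, n_2) = (3, 4, 2)$ and the function names are hypothetical choices made for the example.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)


def partition_size_lower_bound(n0, n1, n2, K2):
    """Lower bound K2 * ceil(K2 * n1 / (K2 + n0 - n2)) on K1 * K2, assuming n0 >= n2."""
    if n0 < n2:
        raise ValueError("the derivation assumes n0 >= n2")
    return K2 * ceil_div(K2 * n1, K2 + n0 - n2)


# hypothetical two-hop configuration; K2 cannot exceed n2 = 2
bounds = [partition_size_lower_bound(3, 4, 2, K2) for K2 in (1, 2)]
is_increasing = all(x < y for x, y in zip(bounds, bounds[1:]))
print(bounds, is_increasing)
```

For this configuration the smallest bound is obtained at $K_2 = 1$, in line with the argument above.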



B. Proof of Theorem 4.2

Let us define the selection matrices $J_{i,k}$ as $n_i \times n_i$ diagonal matrices with

$$(J_{i,k})_{j,j} = \begin{cases} 1 & \text{if } j \in \mathcal{S}_{i,k}, \\ 0 & \text{otherwise.} \end{cases}$$



First, we would like to prove that the maximum diversity gain is achieved. This can be done in two 
steps. The first step is to prove that the parallel channel {11'^}^ with 11'^ = YliLi^ {•^iJi{k)Hi) 
achieves the maximum diversity. To this end, note that by partitioning the rows (respectively, 
columns) of (respectively. Hi) according to the indices in Sj^^i, . . . ,Sn^Kn (respectively, 
5o,i, . . . , Sq^Ki), the matrix Hi can be partitioned into KqK^ blocks, each one being an AF path 
from the source to the destination. Therefore, {11'^!}^ comprises KqKi ■ ■ ■ AF paths, i.e., all 
possible paths. Obviously, these paths include the K independent paths {11,^.}^ in the independent 
partition. Therefore, the maximum diversity is achieved since J2k=i ll-'^fcllr — '^k=i W^kWl ■ 

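The key of the second step is to show that the set of matrices $\{\Pi_k\}_k$ defined in (30) is actually
an invertible constant linear transformation of $\{\Pi'_k\}_k$, i.e.,

$$\begin{bmatrix} \Pi_1 \\ \vdots \\ \Pi_{K'} \end{bmatrix} = T \begin{bmatrix} \Pi'_1 \\ \vdots \\ \Pi'_{K'} \end{bmatrix}.$$

In this case, we have

$$\sum_{k=1}^{K'} \|\Pi_k\|_F^2 \ge \lambda_{\min}(T^\dagger T) \sum_{k=1}^{K'} \|\Pi'_k\|_F^2 \doteq \sum_{k=1}^{K'} \|\Pi'_k\|_F^2$$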

and the diversity is lower-bounded by the maximum diversity, according to Lemma 4.2. Hence,
the FF scheme also achieves the maximum diversity. The key point is shown in the following.
First, let us divide the set of indices $\{1, \ldots, K'\}$ into $K'/K_1$ groups, each one comprising exactly
$K_1$ integers $i_1, \ldots, i_{K_1}$ such that $f_j(i_1) = \cdots = f_j(i_{K_1})$, $\forall j = 2, \ldots, N-1$, and $f_1(i_j)$ varies
from 1 to $K_1$. Then, we partition the set $\{\Pi_k\}_k$ according to the partition of the indices described
above. Hence, the matrices in the same group can be rewritten as $\{G F_{1,1} H_1, \ldots, G F_{1,K_1} H_1\}$
with $G$ being some matrix. We have

$$\begin{bmatrix} G F_{1,1} H_1 \\ \vdots \\ G F_{1,K_1} H_1 \end{bmatrix} = T_1 \begin{bmatrix} G J_{1,1} H_1 \\ \vdots \\ G J_{1,K_1} H_1 \end{bmatrix}$$

where $T_1$ is composed of $K_1 \times K_1$ blocks of matrices with the $(i,j)$-th block being $-I$ if
$i = j \ge 2$ and $I$ otherwise. We can verify that $T_1$ is invertible and that, with the transformation,
the matrices $F_{1,k}$ are replaced by the $J_{1,k}$ with the same indices. In the same manner, we can
successively replace the matrices $F_{2,k}, \ldots, F_{N-1,k}$ with $J_{2,k}, \ldots, J_{N-1,k}$ by invertible
transformations $T_2, \ldots, T_{N-1}$ similar to $T_1$. Finally, we obtain $\{\Pi'_k\}_k$ and the total transformation is
invertible, constant and linear.
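The invertibility of $T_1$ can be checked on its scalar block pattern, since $T_1$ is that $K_1 \times K_1$ pattern of $\pm 1$ entries with each entry multiplied by an identity block. The sketch below computes the determinant of the pattern exactly and also verifies, on diagonals, that the rows of the pattern turn selection matrices into sign-flip matrices. The helper names and the example partition are hypothetical, and the reading of the flip matrices as $F_{1,1} = I$, $F_{1,k} = I - 2J_{1,k}$ for $k \ge 2$ is an assumption inferred from the block structure of $T_1$ described above.

```python
from fractions import Fraction


def flip_pattern(K):
    """K x K pattern of T_1: -1 where i == j >= 2 (1-based indices), +1 elsewhere."""
    return [[-1 if (i == j and i >= 1) else 1 for j in range(K)] for i in range(K)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, det = len(a), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


# hypothetical partition of 4 relay antennas into 3 supernodes
supernodes = [{0, 1}, {2}, {3}]
n, K = 4, len(supernodes)
J = [[1 if d in s else 0 for d in range(n)] for s in supernodes]   # diagonals of J_{1,k}
T = flip_pattern(K)
# diagonal of sum_j T[k][j] * J_{1,j}: all ones for k = 1, sign flip on supernode k otherwise
F = [[sum(T[k][j] * J[j][d] for j in range(K)) for d in range(n)] for k in range(K)]
dets = [determinant(flip_pattern(k)) for k in range(1, 6)]
print(F, dets)
```

Subtracting the first row of the pattern from the others leaves $-2$ on the remaining diagonal entries, so the determinant is $(-2)^{K_1 - 1} \ne 0$, which the exact computation reproduces for small sizes.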

Note that the parallel channel of the FF scheme is in outage for a target rate i^VlogSNR 
implies that at least one of the sub-channels is in outage for a target rate rlogSNR. Therefore, 
one can show that SNR"'^''''^''^ < SNR^''^''^''^ from which (f^{r) > d^^{r). Finally, by showing 
that (F^{r) is piece-wise linear with K' ho sections, we prove the theorem. 



C. Proof of Theorem 5.2

Let $\lambda(M)$ and $\alpha(M)$ denote the vector of the ordered eigenvalues and the corresponding
eigen-exponents of a matrix $M$. The theorem can be proved by showing a stronger result: the
asymptotic pdf of $\alpha(\Pi_{PF}^\dagger \Pi_{PF})$ in the high SNR regime is identical to that of $\alpha(\Pi^\dagger \Pi)$. We
show it by induction on $N$. For $N = 1$, since $\Pi_{PF} = H_1$, the result is direct. Suppose that the
theorem holds for A^. Let us show that it also holds for + L Note that IIpp = H_j^^iP j^Upy; = 
ILn+i^nQj^^pf, from which we have 

for a given 11. Similarly, n'^II' ~ Wn„,i„(niv+i, A(n^n)). At high SNR, we can show that 

a((^^Q;npp)^(^^Q;npp)) = a((Q;npp)^(Q;npp)) 

= a(n;pnpp), 
where the first equality comes from Lemma A1.8 and the second one holds because $(Q_N^\dagger \Pi_{PF})^\dagger (Q_N^\dagger \Pi_{PF}) = \Pi_{PF}^\dagger \Pi_{PF}$. Finally, since we suppose that the joint pdf of $\alpha((\Pi_{PF})^\dagger \Pi_{PF})$ is the same as that of
$\alpha(\Pi^\dagger \Pi)$, we can draw the same conclusion for $\alpha((\Pi'_{PF})^\dagger \Pi'_{PF})$ and $\alpha((\Pi')^\dagger \Pi')$.

D. Proof of Theorem 1(5.71 

Let us consider an equivalent block-diagonal channel of the parallel channel (43) in the
following form

$$y_e = \operatorname{diag}(\Pi_k)\, x_e + z_e, \tag{107}$$

where $x_e = [x_1^T \; x_2^T \; \cdots \; x_K^T]^T$, and $y_e$, $z_e$ are defined in the same manner. Now, from the
parallel NVD code $\mathcal{X}$, we can build a block-diagonal code $\mathcal{X}_{BD}$ with codewords defined by
$X_{BD} = \operatorname{diag}\{X_k\}$. We can verify that $\mathcal{X}_{BD}$ is actually a rate-$n_{av}$ NVD code defined in [34] with
$n_{av} = \sum_k n_{t,k}/K$. From [34, Th. 3], we have
\ ^av / 

= d{Kr), 

where $d(r)$ is the DMT of the parallel channel (and thus the block-diagonal channel). Finally, it is
obvious that $d_{\mathcal{X}}(Kr) = d_{\mathcal{X}_{BD}}(r)$, since using $\mathcal{X}$ will have the same error performance* as using
$\mathcal{X}_{BD}$ except that the transmission rate is $K$ times higher. We have thus $d_{\mathcal{X}}(r) \ge d(r)$. It is shown
in [34] that the achievability holds for any fading statistics. Thus, the code is approximately
universal.

E. An Alternative Code Construction 

A simple alternative construction that is approximately universal is described as follows. Let
$\mathcal{X}_{full}$ be an $n_{sum} \times T$ full rate NVD code with $n_{sum} = \sum_k n_{t,k}$ and $T \ge n_{sum}$. Then, $\mathcal{X}_{full}$ achieves
the DMT $d(r)$ of the channel (107). It means that by partitioning every codeword matrix $X_{full} \in \mathcal{X}_{full}$ into $K \times 1$ blocks in such a way that the $k$th block is of size $n_{t,k} \times T$ and sending
the $k$th block in the $k$th sub-channel, the DMT of the original parallel channel is achieved.
Although this construction is simple and suitable for both symmetric and asymmetric channels,
the main drawback is that the coding delay is roughly $K$ times larger than that of the parallel NVD
code constructed in Section VI-C. Decoding complexity of such codes is sometimes prohibitive.
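The row partition of a codeword described above can be sketched as follows. The snippet only illustrates the bookkeeping of splitting an $n_{sum} \times T$ matrix into $K$ row blocks of sizes $n_{t,k} \times T$; it does not construct an NVD code, and the function name, the symbolic entries and the sizes $(2, 1)$ are hypothetical.

```python
def split_codeword(codeword, antennas_per_subchannel):
    """Split an n_sum x T codeword (list of rows) into K consecutive row blocks.

    Block k gets antennas_per_subchannel[k] rows and is meant for sub-channel k.
    """
    if sum(antennas_per_subchannel) != len(codeword):
        raise ValueError("block sizes must add up to the number of codeword rows")
    blocks, start = [], 0
    for n_tk in antennas_per_subchannel:
        blocks.append(codeword[start:start + n_tk])
        start += n_tk
    return blocks


# hypothetical 3 x 3 codeword with symbolic entries, split for n_{t,1} = 2 and n_{t,2} = 1
codeword = [["x11", "x12", "x13"],
            ["x21", "x22", "x23"],
            ["x31", "x32", "x33"]]
blocks = split_codeword(codeword, [2, 1])
print(blocks)
```

Each block keeps all $T$ columns, which is why the coding delay of this construction is driven by $T \ge n_{sum}$.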

* This is due to the block-diagonal nature of the equivalent channel.
F. $\zeta_{2^m}$ is not a norm in $K$

Assume that $\zeta_{2^m}$ is a norm in $K$, which means

$$\exists\, x \in K, \quad N_{K/\mathbb{Q}(\zeta_{2^m})}(x) = \zeta_{2^m}. \tag{108}$$

Consider now the extensions described in Fig. [6] with the proper fields. From (11081) and the left 
extension of Fig.O we deduce that A^K/(Q(i)(x) = A^Q(C2m)/Q(i) {N^/m2m){x)) = since the 
minimal polynomial of is X^"" ^ — i. Meanwhile, from the right extension of Fig.[6l we 
have Nk/q(,){x) = N^^.^^y^^^^ (^k/q(*,v^) (^)) = Denote y = A^^/^^ . (a;) G Q (z, Vs). 
Then the number z = ^-y^ y has an algebraic norm equal to i, and belongs to Q (i, v^) which 
is in contradiction with the result obtained in [48]. So, $\zeta_{2^m}$ is a non-norm element.
Fig. 8. Vertical reduction: target data rate 2 bits per channel use in the outage performances or 4-QAM constellation in the
coded cases. Panels: (a) outage probability; (b) symbol error rate. Legend: (3, 1, 4, 2) with 3×3 DAST; (3, 1, 2, 2) with 3×3 DAST; (2, 1, 2, 2) with 2×2 DAST; (2, 1, 2, 2) with Alamouti.
Fig. 9. AF vs. AFF: target data rate 4 bits per channel use in the outage performances or 4-QAM constellation in the coded
cases.
Fig. 10. The (3, 1, 4, 2) multihop channel: outage probability of the serial partition with various numbers of decoding clusters,
target data rate 2 bits per channel use. Horizontal axis: $E_b/N_0$ (dB).